1 Introduction

This document (irrespective of its format, most probably HTML or PDF) resulted from compiling the corresponding Rmarkdown script and contains all the results and plots supporting the paper (i.e., its Supplementary Materials). The primary data are available upon request from the corresponding author (Manxiang Wu), as instructed in the paper, but all the code needed to reproduce this document is available in the GitHub repository https://github.com/ddediu/tone_ax_dong.

Please note that this Rmarkdown script caches some very expensive computations in the ./cached_results directory. It can be manually forced to recompute everything (by setting the variable FORCE_COMPUTE_ALL to TRUE), and it also forces a full recomputation if any of the input files (in the directory ./input_files) was changed, deleted or added (or if this is the first time it is run). However, it is highly recommended not to run such a full recomputation during the knitting of the Rmarkdown, but instead in a “normal” R session using, for example, the “Run ▾” → “Run all” menu in RStudio (knitting the whole thing seems to generate crashes due to memory issues). This full recomputation should also be done on a powerful machine (with at least 32 GB of RAM and a 4-core CPU) running Linux or macOS (to fully use multicore parallelism; moreover, this script was not tested on Windows) and it may take a while (about 3 hours on an AMD Ryzen 7 3700X with 64 GB of RAM), but subsequent knittings or small changes can be run on a “normal” machine, as the expensive results are cached for later use.
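The caching logic described above can be sketched as follows (a minimal illustration only: the variable name FORCE_COMPUTE_ALL and the directory names are from the text, but the helper functions and the fingerprint file are hypothetical, not the script's actual implementation):

```r
# Sketch of the "recompute if forced or if input files changed" logic.
FORCE_COMPUTE_ALL <- FALSE

input_fingerprint <- function(dir = "./input_files") {
  files <- sort(list.files(dir, full.names = TRUE))
  # Fingerprint = file names + sizes + modification times, so that any
  # added, deleted or changed input file alters the fingerprint
  paste(files, file.size(files), file.mtime(files), collapse = "|")
}

needs_recompute <- function(cache_file = "./cached_results/fingerprint.rds") {
  if (FORCE_COMPUTE_ALL || !file.exists(cache_file)) return(TRUE)
  !identical(readRDS(cache_file), input_fingerprint())
}
```

On the first run (no cached fingerprint yet) `needs_recompute()` returns `TRUE`, triggering the full computation.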

1.1 Typographic conventions

This document uses the following font and color conventions:

  • regular text is rendered as “regular text”;
  • emphasis is represented using italic text, bold text or bold+italic text;
  • software or programming concepts (e.g., applications, packages or function names) are represented using fixed font text;
  • section heads use specific font sizes and are numbered;
  • hyperlinks to sections of this document and to external resources on the web are represented as link to the Introduction or link to R project’s website, and can be clicked to navigate there;
  • notes are represented as numbered superscripts1 which can be clicked to go to the note’s text;
  • captions use colored font, are numbered, and are placed below the corresponding figure and above the corresponding table;
  • raw output, as produced by various R functions and expressions, is shown using fixed font text in clearly marked boxes;

1.2 Software and hardware info

The full information about the version of R (R Core Team, 2023), the packages and the hardware and software platform used to obtain this document is given in the Section Session information at the end of this document2.

2 The data

2.1 The population and language

We collected data from native speakers of a Southern dialect of Kam, described in detail in Wu (2018), probably Glottolog sout2741, which is characterized by a very complex tone system (see Wu, 2018, pp. 28–36) with 10 phonemic (i.e., contrastive) tones realized as 15 phonetic tones (and affected by various tone sandhi rules and language contact-induced ongoing changes). As expected, WALS assigns a “Complex tone system” to this variety (see “Chapter 13A” for Language Dong (Southern)); unfortunately, neither PHOIBLE nor LAPSyD seems to contain any information about it.

In total, we collected usable data from 492 unique participants, of whom we further excluded 2 who reported hearing problems (no participant reported brain or cognitive impairment), leaving a total of 490 participants in the sample.

Concerning self-declared gender, there are 331 (67.6%) self-declared females and 159 (32.4%) self-declared males:
**Figure S1.** Distribution of *gender* in the sample. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS1'/>


At the time of data collection, the age of the participants was distributed between 15 and 72 years, with a mean of 40.4 (and median 44) and standard deviation of 12.6 (and interquartile range, IQR, 18):
**Figure S2.** Distribution of *age* overall (thick black curve) and by *gender* (colored transparent curves) in the sample. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS2'/>


It can be seen that the sample has 2.1 times more females than males (the χ2 test against the expected 50%:50% distribution is highly significant: χ2(1) = 60.4, p = 7.84×10⁻¹⁵), but the ages are distributed in similar ways between the two genders, with a bi-modal distribution suggesting two age groups: one composed of adolescents and young adults (centered around 20 years of age and ranging between the minimum of 15 and about 30 years old) and the other (“adults”) centered around the late 40s.
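The goodness-of-fit test above can be reproduced directly from the reported counts (331 females, 159 males) using base R's `chisq.test()`:

```r
# Chi-square goodness-of-fit test against an expected 50%:50% gender split
counts <- c(female = 331, male = 159)
res <- chisq.test(counts, p = c(0.5, 0.5))
unname(res$statistic)  # X-squared ~ 60.4 on 1 df
res$p.value            # ~ 7.8e-15
```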

We used the location as a proxy for any relevant socio-linguistic dimensions of variation, and we ended up with participants from 5 locations:

Location    B    A    C    D    E
Count     237  227    1    1    1

The vast majority of participants come from two neighboring locations, A and B (about 8 km apart), speaking very similar dialects of Kam (Manxiang Wu, p.c.); therefore we collapsed the remaining locations into an “other” category. Please note that for 23 (4.7%) participants this information is missing.

We also were able to retrieve some information concerning familial relationships for 91 (18.6%) participants, grouped in 28 nuclear families across a maximum of 3 generations: generation 0 comprises the youngest members, 1 is their parents’ generation, and 2 is their grandparents’ generation:
**Figure S3.** Distribution of the participants with information about family by generation: 0=youngest (black), 1=their parents (gray) and 2=their grandparents (light gray). The families are identified with an arbitrary unique numerical ID. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS3'/>


It can be seen that there is only one participant in generation 2, so we collapsed generations 1 and 2, resulting in a binary split into a “young” and an “older” generation.
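The collapsing of generations 1 and 2 can be done with a one-line recoding in base R (a toy vector is used here for illustration; the actual variable names in the script may differ):

```r
# Collapse generations 1 and 2 into "older", keeping 0 as "young"
generation <- c(0, 0, 1, 2, 1, 0)  # toy example
gen2 <- factor(ifelse(generation == 0, "young", "older"),
               levels = c("young", "older"))
table(gen2)
```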

Generation  young  older
Count          72     19

2.2 The covariates

We considered three covariates here (as age has already been covered above, we focus on the remaining two).

2.2.1 Years of musical training

This variable (music_years) is self-declared and is distributed as follows:
**Figure S4.** Distribution of *music_years* overall (thick black curve) and by *gender* (colored transparent curves) in the sample. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS4'/>


It is clear that in our sample it does not capture any interesting pattern of inter-individual variation, so we will ignore it in the following analyses.

2.2.2 Years of formal education

This variable (education_years) is self-declared and is distributed as follows:
**Figure S5.** Distribution of *education_years* by *gender* in the sample. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS5'/>


**Figure S6.** Distribution of *education_years* by *location* in the sample. *NA* means that the location information was not available. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS6'/>


Overall, gender makes a significant difference (t(394.2) = -6.4, p = 3.5×10⁻¹⁰; Mann–Whitney W = 1.76×10⁴, p = 1.75×10⁻⁹), with males having on average 2.35 more years of formal education than females (i.e., 7.64 for males vs 5.29 for females).

Focusing on the two locations that comprise the majority of the participants, location A has significantly higher educational levels than location B (t(461.9) = 4, p = 8.05×10⁻⁵; Mann–Whitney W = 3.25×10⁴, p = 8.24×10⁻⁵), with the participants from location A having on average 1.53 more years of formal education than those from B (i.e., 6.67 vs 5.14).

We performed the linear regression of education_years on age, gender and their interaction, and we found that this model behaves well (diagnostic plots not shown), that it explains adjusted R2 = 48.7% of the variance, and that all terms are significant: age has a highly significant negative main effect (β = -0.264, p = 7.22×10⁻⁶⁰), gender has a very large and highly significant main effect, with males having fewer years of education than females (β = -3.758, p = 8.23×10⁻⁵), but there is a highly significant interaction between the two (β = 0.138, p = 2.92×10⁻⁹) that offsets the main negative effect of gender into an advantage for males over females:
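The model above corresponds to the formula `education_years ~ age * gender` in R. A minimal sketch on synthetic data (the data frame and the generating coefficients here are illustrative only, chosen to roughly mimic the reported signs of the effects):

```r
# Linear regression with an age x gender interaction (synthetic data)
set.seed(1)
d <- data.frame(age = runif(200, 15, 72),
                gender = factor(sample(c("female", "male"), 200, replace = TRUE)))
d$education_years <- 14 - 0.26 * d$age - 3.8 * (d$gender == "male") +
  0.14 * d$age * (d$gender == "male") + rnorm(200, sd = 2)

m <- lm(education_years ~ age * gender, data = d)
coef(m)  # (Intercept), age, gendermale, age:gendermale
```

With "female" as the alphabetically first factor level, R uses it as the reference, so `gendermale` is the male-vs-female contrast, matching the text.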
**Figure S7.** Predictive plot of the linear regression of *education_years* on *gender*, *age* and their interaction. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS7'/>


2.3 The measures

2.3.1 The working memory task

The working memory task consists of 15 trials; in each trial a sequence of colors (each color appears only once in a trial) is shown to the participant, who has to reproduce the colors in the correct order, the score being the number of colors reproduced in the correct position. For example, trial 1 shows “red”, “green” and “blue”; if a participant reproduces “yellow”, “green”, “red”, her score is 1 (from “green”). The trials are the same across participants and vary from 3 to 7 colors, as shown below:

Trial Length Color 1 Color 2 Color 3 Color 4 Color 5 Color 6 Color 7
1 3 red green blue
2 3 black purple yellow
3 3 gray green black
4 4 red blue purple gray
5 4 yellow black green blue
6 4 green red black yellow
7 5 blue black gray yellow red
8 5 gray purple yellow green blue
9 5 black red blue gray green
10 6 green black purple blue gray yellow
11 6 yellow purple black red green gray
12 6 gray purple blue red green yellow
13 7 red green blue black purple yellow gray
14 7 blue gray black green red purple yellow
15 7 yellow red blue green gray black purple
Table S1. The working memory trials showing their length and the color sequence. Each color may appear at most once in any given trial but can repeat across trials. The trials are fixed across all participants.
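The positional scoring described above can be sketched as a small R function (the function name is hypothetical; the example reproduces the trial-1 case from the text):

```r
# One point per color reproduced in the correct position
wm_score <- function(target, response) {
  n <- min(length(target), length(response))
  sum(target[seq_len(n)] == response[seq_len(n)])
}

wm_score(c("red", "green", "blue"),
         c("yellow", "green", "red"))  # 1 (only "green" matches its position)
```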
The distribution of scores across participants by trial is:
**Figure S8.** Distribution of the *working memory scores* across the trials by gender (showing the actual counts with the females stacked on top of the males). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS8'/>


gender makes a significant difference (t(4798.1) = -4.3, p = 2.04×10⁻⁵; Mann–Whitney W = 5.52×10⁶, p = 1.88×10⁻⁶), with males scoring slightly higher across trials than females, by 0.17 points on average (i.e., 2.29 for males vs 2.11 for females).

The three trials with the same length visually seem to have similar behaviors, but there are nevertheless significant differences between them:

Length Trials ANOVA by trial Signif. pairwise diffs.
3 01, 02, 03 F(2, 1467)=13.54, p=1.5×10⁻⁶ trial 01 is easier
4 04, 05, 06 F(2, 1467)=0.25, p=0.782 all trials are similar
5 07, 08, 09 F(2, 1467)=13.18, p=2.12×10⁻⁶ trial 09 is easier
6 10, 11, 12 F(2, 1467)=3.76, p=0.023 trial 12 is easier than trial 10
7 13, 14, 15 F(2, 1467)=13.05, p=2.41×10⁻⁶ trial 14 is harder
Table S2. Comparing the trials of the same length using one-way ANOVA with post-hoc Tukey pairwise comparisons (details not shown here but summarized in the last column).
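The comparisons in Table S2 use base R's `aov()` followed by `TukeyHSD()`; a minimal sketch on synthetic scores (the data here are illustrative, not the actual ones, but with 3 × 490 observations the residual degrees of freedom match the 1467 reported above):

```r
# One-way ANOVA across the three same-length trials, with post-hoc Tukey
set.seed(2)
d <- data.frame(trial = factor(rep(c("01", "02", "03"), each = 490)),
                score = c(rpois(490, 2.4), rpois(490, 2.1), rpois(490, 2.1)))
fit <- aov(score ~ trial, data = d)
summary(fit)    # F test on 2 and 1467 df
TukeyHSD(fit)   # pairwise trial differences with adjusted p-values
```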

We conducted both a Principal Component Analysis (PCA) and an Exploratory Factor Analysis (EFA) on all trials together, and we found that there seems to be a single factor.

For PCA, PC1 explains 40.2% of the variance, followed by PC2 which explains only 6.1%, suggesting that all trials load on a single latent variable:

**Figure S9.** Screeplot of the PCA of all the working memory trials together. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS9'/>


**Figure S10.** Loading of the working memory trials on the first 2 PCs. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS10'/>


**Figure S11.** The participants plotted on the first 2 PCs, colored by their qualities of representation (cos2). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS11'/>


For EFA, all the preliminary tests suggest that factor analysis is appropriate (Kaiser–Meyer–Olkin = 0.95 > 0.60; Bartlett’s test is significant: χ2(105) = 2297.1, p ≈ 0; and det(cor(data)) = 0.0086 > 0) and all the recommended methods for finding the appropriate number of factors suggest that 1 factor is enough:
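Of these checks, the determinant criterion can be computed in base R; the KMO and Bartlett statistics come from the `psych` package in the actual analysis (`KMO()`, `cortest.bartlett()`, `fa.parallel()`). A base-R sketch on placeholder data (random values standing in for the 490 × 15 trial scores):

```r
# Non-singularity check for EFA: the correlation matrix must have det > 0
set.seed(3)
x <- matrix(rnorm(490 * 15), ncol = 15)  # placeholder for the 15 trials
d <- det(cor(x))
d > 0  # TRUE: the correlation matrix is invertible, so EFA can proceed
```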

**Figure S12.** Screeplot of the observed, simulated and randomized data with 1 standard deviation error bars (as generated by `fa.parallel()`). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS12'/>


**Figure S13.** Number of factors as suggested by the VSS criterion (top left), the complexity of the solution (top right), BIC (bottom left) and Root Mean Residual (bottom right), as implemented by `nfactors()`. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS13'/>


Even if this 1-factor model does not formally seem sufficient to explain all of the variance in the data (about 35.8% of the variance explained, but χ2(90) = 131.0, p = 0.0031 < 0.05), the loadings are very similar:
**Figure S14.** Loadings of the variables in the 1 factor model. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS14'/>


We also implemented a Confirmatory Factor Analysis (CFA) with all trials loading on a single latent wm variable, and we found that while the model formally does not fit the data (χ2(133) = 90.0, p = 0.0022 ≤ 0.05), its fit indices are acceptable (CFI=0.98, TLI=0.98, RFI=0.93, RMSEA=0.03) and the path coefficients suggest that all trials load in similar ways on the wm latent variable:

Figure S15. Confirmatory factor analysis (CFA) of the working memory trials with a single latent factor wm. Figure generated using R version 4.3.3 (2024-02-29)

Given all this, it makes sense to compute a total score from all the trials (i.e., using the same weight of 1.0); we further normalized it between its minimum possible score of 0.0 and its maximum of 3 × (3+4+5+6+7) = 75 (this variable will be denoted wm_norm).
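The normalization reduces to dividing the total score by 75 (three trials at each of the lengths 3 through 7); a minimal sketch (the matrix name `trial_scores` is hypothetical):

```r
# Maximum possible total: 3 trials at each length 3..7
max_score <- 3 * sum(3:7)   # 75

# Normalized score for a participant with a total of, say, 45 points:
total <- 45
wm_norm <- total / max_score  # 0.6

# For the full 490 x 15 score matrix this would be:
# wm_norm <- rowSums(trial_scores) / max_score
```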

This variable is distributed as follows:
**Figure S16.** **A**: distribution of *normalized working memory score* (*wm_norm*) overall (thick black curve) and by *gender* (colored transparent curves) in the sample. **B**: relationship between *wm_norm* and *age* by *gender* with linear regression lines (and 95%CIs). **C**: Relationship between *wm_norm* and *education_years* by *gender* with linear regression lines (and 95%CIs). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS16'/>


We performed the linear regression of wm_norm on age, the number of years of formal education (education_years), gender (the reference level being females) and all their interactions, and, following manual simplification, we found that this model behaves well (diagnostic plots not shown), that it explains adjusted R2 = 50.6% of the variance, and that age has a highly significant negative effect (β = -0.006, p = 5.62×10⁻¹⁷), education_years has a highly significant positive effect (β = 0.021, p = 3.95×10⁻²²), and gender shows a significant difference between males and females, with males having overall smaller scores than females (β = -0.032, p = 0.025).

2.4 The tone task

The tone task is an AX task in which the participant is presented, in a given trial, with a pair of syllables that may differ only in tone, and has to decide whether the two syllables are the “same” or “different”. For a given tone pair (say, “a” and “b”), there are two syllables with different segmental content (say, “A” and “B”), resulting in the following four syllable+tone combinations: Aa, Ab, Ba and Bb. With these, we have the following possible trials:

  • “same”: AaAa, AbAb, BaBa and BbBb – the tones and segmental content of the two syllables are the same (these are denoted as “same_1”, “same_2”, “same_3” and “same_4”),
  • “different”: AaAb, AbAa (the order differs), BaBb, BbBa (the order differs), resulting in 4 pair × order trials (denoted as “order_1” and “order_2”, and “pair_1” and “pair_2”, respectively).

Each trial was repeated twice, and the response to a given trial was scored as “correct” if the tones were different and the response was “different”, or if the tones were the same and the response was “same”, and as “incorrect” otherwise. Each participant was presented with a random (unique) order of the trials. Please note that some of the possible trials were not included in the task, as the participants showed a ceiling effect during the pilot study.
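The scoring rule above can be sketched as a small R function (the function name is hypothetical):

```r
# An AX response is correct iff it matches the true tone relation
score_ax <- function(tone1, tone2, response) {
  truth <- if (identical(tone1, tone2)) "same" else "different"
  response == truth
}

score_ax("35", "335", "different")  # TRUE  (tones differ)
score_ax("33", "33",  "different")  # FALSE (tones are the same)
```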

We used the 9 (of the 10) phonological tones in the language that occur in unchecked syllables (Wu, 2018), represented here by letters:

Table S3. Representing tones (top row, using the numeric notation with 5 levels) by letters (bottom row). Please see Wu (2018) and Donohue & Wu (2013) for details about the tones in this language, and, for example, Yip (2002), for a general introduction to tone.
Tone    23  33  332  52  452  35  335  32  212
Letter   c   h    k   l    p   s    t   v    x

Likewise, in the name of brevity, we denote the segmental content of the syllables using CAPITAL letters, as follows:

Table S4. Uniquely mapping the segmental content of the syllables used onto CAPITAL letters (for notational brevity).
Syllable  khem  koi  pem  pju  poi  pui  sai  sɐm  sem  seu  səu  som  teu  thɐŋ
Letter       A    B    C    D    E    F    G    H    I    J    K    L    M     N

The actual stimuli used are listed below:

Table S5. The 40 stimuli used in the ‘same’ task, showing the segments and tone, ordered alphabetically; some are actual words in the language (also marked using bold). Please note that the actual trials present pairs of such identical stimuli.
Syllable Segments Tone Tone letter Is real word? Short name
khem332 khem 332 k no Ak
khem335 khem 335 t no At
koi33 koi 33 h no Bh
koi35 koi 35 s no Bs
koi32 koi 32 v no Bv
koi212 koi 212 x no Bx
pem33 pem 33 h no Ch
pem32 pem 32 v yes Cv
pju23 pju 23 c no Dc
pju33 pju 33 h no Dh
pju212 pju 212 x no Dx
poi23 poi 23 c no Ec
poi52 poi 52 l no El
poi212 poi 212 x no Ex
pui35 pui 35 s no Fs
pui32 pui 32 v no Fv
sai335 sai 335 t no Gt
sai212 sai 212 x no Gx
sɐm35 sɐm 35 s yes Hs
sɐm335 sɐm 335 t no Ht
sem23 sem 23 c no Ic
sem332 sem 332 k no Ik
sem452 sem 452 p no Ip
sem35 sem 35 s no Is
sem335 sem 335 t no It
sem32 sem 32 v no Iv
sem212 sem 212 x no Ix
seu332 seu 332 k no Jk
seu212 seu 212 x no Jx
səu332 səu 332 k no Kk
səu52 səu 52 l no Kl
som332 som 332 k no Lk
som52 som 52 l no Ll
som452 som 452 p no Lp
som32 som 32 v no Lv
teu23 teu 23 c no Mc
teu33 teu 33 h no Mh
teu52 teu 52 l yes Ml
thɐŋ332 thɐŋ 332 k no Nk
thɐŋ452 thɐŋ 452 p no Np
Table S6. The 52 pairs of stimuli used in the ‘different’ task, showing the segments and tone, ordered alphabetically; some are actual words in the language (also marked using bold). Please note that the reverse order of the stimuli is not shown.
Syllable 1 Syllable 2 Segments 1 Tone 1 Tone letter 1 Is real word 1? Segments 2 Tone 2 Tone letter 2 Is real word 2? Short name
khem332 khem335 khem 332 k no khem 335 t no Akt
khem335 khem332 khem 335 t no khem 332 k no Atk
koi33 koi212 koi 33 h no koi 212 x no Bhx
koi35 koi32 koi 35 s no koi 32 v no Bsv
koi32 koi35 koi 32 v no koi 35 s no Bvs
koi212 koi33 koi 212 x no koi 33 h no Bxh
pem33 pem32 pem 33 h no pem 32 v yes Chv
pem32 pem33 pem 32 v yes pem 33 h no Cvh
pju23 pju33 pju 23 c no pju 33 h no Dch
pju23 pju212 pju 23 c no pju 212 x no Dcx
pju33 pju23 pju 33 h no pju 23 c no Dhc
pju212 pju23 pju 212 x no pju 23 c no Dxc
poi23 poi52 poi 23 c no poi 52 l no Ecl
poi23 poi212 poi 23 c no poi 212 x no Ecx
poi52 poi23 poi 52 l no poi 23 c no Elc
poi52 poi212 poi 52 l no poi 212 x no Elx
poi212 poi23 poi 212 x no poi 23 c no Exc
poi212 poi52 poi 212 x no poi 52 l no Exl
pui35 pui32 pui 35 s no pui 32 v no Fsv
pui32 pui35 pui 32 v no pui 35 s no Fvs
sai335 sai212 sai 335 t no sai 212 x no Gtx
sai212 sai335 sai 212 x no sai 335 t no Gxt
sɐm35 sɐm335 sɐm 35 s yes sɐm 335 t no Hst
sɐm335 sɐm35 sɐm 335 t no sɐm 35 s yes Hts
sem23 sem452 sem 23 c no sem 452 p no Icp
sem332 sem35 sem 332 k no sem 35 s no Iks
sem452 sem23 sem 452 p no sem 23 c no Ipc
sem452 sem32 sem 452 p no sem 32 v no Ipv
sem452 sem212 sem 452 p no sem 212 x no Ipx
sem35 sem332 sem 35 s no sem 332 k no Isk
sem35 sem335 sem 35 s no sem 335 t no Ist
sem335 sem35 sem 335 t no sem 35 s no Its
sem32 sem452 sem 32 v no sem 452 p no Ivp
sem212 sem452 sem 212 x no sem 452 p no Ixp
seu332 seu212 seu 332 k no seu 212 x no Jkx
seu212 seu332 seu 212 x no seu 332 k no Jxk
səu332 səu52 səu 332 k no səu 52 l no Kkl
səu52 səu332 səu 52 l no səu 332 k no Klk
som332 som52 som 332 k no som 52 l no Lkl
som52 som332 som 52 l no som 332 k no Llk
som52 som452 som 52 l no som 452 p no Llp
som452 som52 som 452 p no som 52 l no Lpl
som452 som32 som 452 p no som 32 v no Lpv
som32 som452 som 32 v no som 452 p no Lvp
teu23 teu33 teu 23 c no teu 33 h no Mch
teu23 teu52 teu 23 c no teu 52 l yes Mcl
teu33 teu23 teu 33 h no teu 23 c no Mhc
teu33 teu52 teu 33 h no teu 52 l yes Mhl
teu52 teu23 teu 52 l yes teu 23 c no Mlc
teu52 teu33 teu 52 l yes teu 33 h no Mlh
thɐŋ332 thɐŋ452 thɐŋ 332 k no thɐŋ 452 p no Nkp
thɐŋ452 thɐŋ332 thɐŋ 452 p no thɐŋ 332 k no Npk

Please note that the stimulus pairs sem35:sem335 (Ist) and sɐm35:sɐm335 (Hst) are considered “difficult” by the task designers, in the sense that it is hard to hear the difference in tones even for highly trained speakers of a tone language (Manxiang Wu, p.c.).

2.4.1 Descriptives

We begin this analysis based on the “6 steps” approach of Dima (2018) and the accompanying R code available at https://github.com/alexadima/6-steps-protocol.

2.4.1.1 Percent correct responses

Please note that we will use a short notation for the items, composed of the segment CAPITAL letter, followed by the one-letter (for ‘same’) or two-letter (for ‘different’) tone notation, and the presentation number (e.g., Its2 is the 2nd presentation of the ‘different’ item sem335:sem35).
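This naming scheme is a simple string concatenation; a sketch in R (the helper name is hypothetical, and the examples use item names from the tables below):

```r
# Short item name = segment letter + tone letter(s) + presentation number
short_name <- function(segment, tones, presentation) {
  paste0(segment, paste(tones, collapse = ""), presentation)
}

short_name("I", c("t", "s"), 2)  # "Its2" ('different' item sem335:sem35)
short_name("M", "c", 4)          # "Mc4"  ('same' item teu23)
```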

Table S7. Frequencies of ‘yes’ responses (the items are ordered by % correct responses). The star (*) denotes those items that contain real words in the language.
Item (short name) # correct responses % correct responses
Its2 68 13.8%
Hst2* 74 15.0%
Hst1* 77 15.7%
Hts2* 80 16.3%
Its1 88 17.9%
Ist2 89 18.1%
Ist1 90 18.3%
Hts1* 97 19.7%
Bxh2 99 20.1%
Bxh1 104 21.1%
Bhx2 110 22.4%
Llp1 113 23.0%
Bhx1 116 23.6%
Llp2 122 24.8%
Lpl2 123 25.0%
Klk1 128 26.0%
Lpl1 134 27.2%
Klk2 135 27.4%
Kkl2 181 36.8%
Kkl1 199 40.4%
Dhc1 233 47.4%
Dhc2 239 48.6%
Dch2 266 54.1%
Mhc2 281 57.1%
Dch1 285 57.9%
Mhc1 288 58.5%
Llk2 292 59.3%
Llk1 301 61.2%
Mch2 303 61.6%
Npk2 303 61.6%
Mch1 306 62.2%
Lkl2 313 63.6%
Ipv1 326 66.3%
Ipv2 333 67.7%
Npk1 336 68.3%
Lpv1 337 68.5%
Nkp2 341 69.3%
Lpv2 343 69.7%
Ivp2 352 71.5%
Lvp2 353 71.7%
Lkl1 357 72.6%
Mc1 357 72.6%
Nkp1 357 72.6%
Ivp1 360 73.2%
Mc2 360 73.2%
Lvp1 362 73.6%
Exc2 363 73.8%
Dxc2 366 74.4%
Mc3 366 74.4%
Ecx2 371 75.4%
Dxc1 372 75.6%
Ecx1 373 75.8%
Exc1 379 77.0%
Chv1* 384 78.0%
Atk1 386 78.5%
Mc4 386 78.5%
Atk2 390 79.3%
Dcx2 390 79.3%
Chv2* 400 81.3%
Bsv2 404 82.1%
Akt1 405 82.3%
Dcx1 406 82.5%
Bx1 407 82.7%
Isk2 410 83.3%
Akt2 413 83.9%
Ex1 413 83.9%
Iks2 413 83.9%
Jx1 414 84.1%
Elc2 415 84.3%
Icp2 415 84.3%
Iks1 415 84.3%
Ixp2 416 84.6%
Cvh1* 417 84.8%
Gx2 417 84.8%
Ipc2 417 84.8%
Isk1 418 85.0%
Gx1 419 85.2%
Bv1 421 85.6%
Fsv2 422 85.8%
Bsv1 423 86.0%
Elx1 423 86.0%
Gxt2 423 86.0%
Jxk2 423 86.0%
Cvh2* 424 86.2%
Ecl2 424 86.2%
Elc1 424 86.2%
Exl2 424 86.2%
Ml1* 425 86.4%
Elx2 426 86.6%
Ex3 426 86.6%
Ipx2 426 86.6%
Ll1 426 86.6%
Bx2 427 86.8%
Ecl1 427 86.8%
Fsv1 427 86.8%
Bvs2 428 87.0%
Lk2 428 87.0%
Fvs2 429 87.2%
Ll2 429 87.2%
Dx1 430 87.4%
Fvs1 430 87.4%
Gtx2 430 87.4%
Jk1 430 87.4%
Jxk1 430 87.4%
Mh2 430 87.4%
Mhl2* 430 87.4%
Mlh2* 430 87.4%
Icp1 431 87.6%
Mh1 431 87.6%
Bh1 432 87.8%
Dh1 432 87.8%
Ip1 432 87.8%
Ip2 432 87.8%
Ixp1 432 87.8%
Jkx1 432 87.8%
Jkx2 432 87.8%
Ll3 432 87.8%
Mlc1* 432 87.8%
Bv2 433 88.0%
Ex4 433 88.0%
Fv1 433 88.0%
Jx2 433 88.0%
Mcl2* 433 88.0%
Mhl1* 433 88.0%
Ch1 434 88.2%
Ipc1 434 88.2%
Lk1 434 88.2%
Np1 434 88.2%
Bvs1 435 88.4%
Exl1 435 88.4%
Ic1 435 88.4%
Is1 435 88.4%
Ix1 435 88.4%
Ll4 435 88.4%
Mlc2* 435 88.4%
Ak2 436 88.6%
Lv2 436 88.6%
El3 438 89.0%
Ip6 438 89.0%
Ipx1 438 89.0%
Bs1 439 89.2%
Lp1 439 89.2%
Dx2 440 89.4%
El2 440 89.4%
Ip4 440 89.4%
Lv1 440 89.4%
Mcl1* 440 89.4%
Mh3 440 89.4%
Mlh1* 440 89.4%
Ch2 441 89.6%
Cv1* 441 89.6%
El4 441 89.6%
Ak1 442 89.8%
Gxt1 442 89.8%
Ht2 442 89.8%
Ip5 442 89.8%
It2 442 89.8%
Lp2 442 89.8%
Ml3* 442 89.8%
Ex2 443 90.0%
Ic2 443 90.0%
Ip3 443 90.0%
Iv2 443 90.0%
Kk1 443 90.0%
Nk1 443 90.0%
At1 444 90.2%
Gtx1 444 90.2%
Is2 444 90.2%
Ix2 444 90.2%
Bs2 445 90.4%
Dc3 445 90.4%
El1 445 90.4%
Cv2* 446 90.7%
Fs1 446 90.7%
Ht1 446 90.7%
Ik2 446 90.7%
Dc4 447 90.9%
It1 447 90.9%
Iv1 447 90.9%
Jk2 447 90.9%
Lp3 448 91.1%
Mh4 448 91.1%
Ml2* 448 91.1%
Dc1 449 91.3%
Dh2 449 91.3%
Ik1 449 91.3%
Is4 449 91.3%
Bh2 450 91.5%
Ec1 450 91.5%
Ec2 450 91.5%
Gt1 450 91.5%
Lp4 450 91.5%
Fs2 451 91.7%
Nk2 451 91.7%
Ec4 452 91.9%
Is3 452 91.9%
Ml4* 453 92.1%
Gt2 454 92.3%
Hs1* 454 92.3%
Ec3 455 92.5%
Fv2 455 92.5%
Dc2 456 92.7%
At2 457 92.9%
Kl2 457 92.9%
Np2 459 93.3%
Kk2 461 93.7%
Kl1 461 93.7%
Hs2* 463 94.1%
**Figure S17.** Endorsement frequencies by item (items ordered by % correct responses). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS17'/>


2.4.1.2 Correlations between items

**Figure S18.** Correlation matrix between items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS18'/>


It can be seen that the tetrachoric correlations between the items are rather low, varying between -0.55 and 0.69, with a mean of 0.24, a median of 0.3, sd of 0.25 and IQR of 0.31:
**Figure S19.** Histogram of the tetrachoric correlations between different items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS19'/>


**Figure S20.** Hierarchical clustering of the items using 1 - tetrachoric correlations. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS20'/>


**Figure S21.** Mean tetrachoric correlation with the other items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS21'/>

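For binary (correct/incorrect) items, the tetrachoric correlations are computed with `psych::tetrachoric()` in the actual analysis; the `1 - correlation` distance used for the hierarchical clustering in Figure S20 can be sketched in base R (using the Pearson phi coefficient on 0/1 data as a rough stand-in, on synthetic responses):

```r
# Item-item similarity on binary responses and the derived clustering distance
set.seed(4)
items <- replicate(4, rbinom(490, 1, 0.8))   # synthetic 0/1 responses, 4 items
R <- cor(items)                              # phi; psych::tetrachoric() in the paper
D <- as.dist(1 - R)                          # distance = 1 - correlation
hc <- hclust(D)                              # hierarchical clustering of items
```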

2.4.1.3 “Weird” items

It can be seen that the following items seem “weird”:

Table S8. ‘Weird’ items that have an average negative correlation with the other items.
Item (short name) mean tetra. corr. # correct responses % correct responses
Bhx1 -0.17 116 23.6%
Bhx2 -0.16 110 22.4%
Bxh1 -0.24 104 21.1%
Bxh2 -0.14 99 20.1%
Dhc1 -0.01 233 47.4%
Hst1* -0.20 77 15.7%
Hst2* -0.17 74 15.0%
Hts1* -0.18 97 19.7%
Hts2* -0.17 80 16.3%
Ist1 -0.21 90 18.3%
Ist2 -0.20 89 18.1%
Its1 -0.17 88 17.9%
Its2 -0.25 68 13.8%
Kkl2 -0.03 181 36.8%
Klk1 -0.08 128 26.0%
Klk2 -0.08 135 27.4%
Llp1 -0.13 113 23.0%
Llp2 -0.03 122 24.8%
Lpl1 -0.10 134 27.2%
Lpl2 -0.11 123 25.0%

Interestingly, they seem to form meaningful groups:

  • the 4 “different” items involving segment B (“koi”) and tones h (33) and x (212) in both orders and both presentations; they also have very low % correct responses (around 22%) and also have correlations between them (tetrachoric rho’s between 0.28 and 0.39, with mean = 0.33 and sd = 0.04) → this suggests that tones h and x are very hard to distinguish when paired with segment B;

  • the 4 “different” items involving segment H (“sɐm”) and tones s (35) and t (335) in both orders and both presentations; they also have very low % correct responses (around 17%) and also have correlations between them (tetrachoric rho’s between 0.29 and 0.44, with mean = 0.38 and sd = 0.05) → this suggests that tones s and t are very hard to distinguish when paired with segment H;

  • the 4 “different” items involving segment I (“sem”) and tones s (35) and t (335) in both orders and both presentations; they also have very low % correct responses (around 17%) and also have correlations between them (tetrachoric rho’s between 0.21 and 0.36, with mean = 0.29 and sd = 0.05) → this suggests that tones s and t are very hard to distinguish when paired with segment I;

  • the 4 “different” items involving segment K (“səu”) and tones k (332) and l (52) in both orders and both presentations (please note that technically Kkl1 has a very small positive average correlation); they also have very low % correct responses (around 30%) and also have correlations between them (tetrachoric rho’s between 0.2 and 0.41, with mean = 0.29 and sd = 0.08) → this suggests that tones k and l are very hard to distinguish when paired with segment K;

  • the 4 “different” items involving segment L (“som”) and tones l (52) and p (452) in both orders and both presentations; they also have very low % correct responses (around 25%) and also have correlations between them (tetrachoric rho’s between 0.23 and 0.41, with mean = 0.3 and sd = 0.08) → this suggests that tones l and p are very hard to distinguish when paired with segment L;

  • Dhc1 (involving stimuli pju33 and pju23) basically has an average correlation of 0.0, while Dhc2 and Dch1 have a very small positive correlation, but Dch2 has a large positive correlation, suggesting that this may be a different case;

  • only one real word is involved in these “special” items: sɐm35, which appears in Hst1, Hst2, Hts1 and Hts2.

Moreover:

  • there is only one “same” item (Mc4, i.e., the 4th presentation of stimulus teu23) that has a low positive average correlation;

  • the other “different” stimuli involving B (“koi”) (i.e., Bsv1, Bsv2, Bvs1, Bvs2) have around 86% correct responses, and their tetrachoric correlations range between 0.41 and 0.59, with mean = 0.49 and sd = 0.06;

  • there are no other “different” stimuli involving H (“sɐm”);

  • the other “different” stimuli involving I (“sem”) (i.e., Icp1, Icp2, Iks1, Iks2, Ipc1, Ipc2, Ipv1, Ipv2, Ipx1, Ipx2, Isk1, Isk2, Ivp1, Ivp2, Ixp1, Ixp2) have around 82% correct responses, and their tetrachoric correlations range between 0.17 and 0.62, with mean = 0.42 and sd = 0.08;

  • there are no other “different” stimuli involving K (“səu”);

  • the other “different” stimuli involving L (“som”) (i.e., Lkl1, Lkl2, Llk1, Llk2, Lpv1, Lpv2, Lvp1, Lvp2) have around 68% correct responses, and their tetrachoric correlations range between 0.21 and 0.47, with mean = 0.36 and sd = 0.06;

  • there are no other “different” stimuli involving tones h (33) and x (212);

  • there are no other “different” stimuli involving tones s (35) and t (335);

  • the other “different” stimuli involving tones k (332) and l (52) (i.e., Lkl1, Lkl2, Llk1, Llk2) have around 64% correct responses, and their tetrachoric correlations range between 0.32 and 0.4, with mean = 0.36 and sd = 0.03;

  • there are no other “different” stimuli involving tones l (52) and p (452).

Taken together, these suggest that the five classes of “different” stimuli listed above (i.e., the permutations of Bhx, Hst, Ist, Kkl and Llp) tend to be massively misinterpreted by the participants (resulting in “incorrect” responses). However, we are missing the crucial items that pair the H and K segments with other tone pairs, and the hx, st and lp tone pairs with other segments, which would allow us to decide whether this is due to the particular combinations of segments and tone pairs, or to the segments and/or the tone pairs themselves. It remains an interesting question why these “different” items (10 of them, or 19.2%) behave differently from the other “different” items and from virtually all the “same” items. Apparently, these “incorrect” responses persisted even when some of the participants were provided with explicit feedback by the experimenters (Manxiang Wu, p.c.), suggesting that these perceptions are “real” and not due to inattention or fatigue. Moreover, please note that Ist and Hst are specifically marked as “difficult” by the task creators, suggesting that it is the tone pair st itself that is hard to perceive as being different.
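The tetrachoric correlations discussed above were computed in R; as a rough illustration of the underlying idea only, the following Python sketch estimates the tetrachoric rho for a single 2×2 table of binary responses by maximum likelihood under a thresholded bivariate normal (all counts below are hypothetical, not the study’s data):

```python
# Illustrative ML estimate of the tetrachoric correlation for one 2x2 table
# of binary responses (hypothetical counts, not the study's data).
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.optimize import minimize_scalar

def tetrachoric_rho(table):
    """table = [[n00, n01], [n10, n11]]: joint counts of two binary variables."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    tau_x = norm.ppf(t[0, :].sum() / n)  # threshold: P(Z <= tau_x) = P(X = 0)
    tau_y = norm.ppf(t[:, 0].sum() / n)  # threshold: P(Z <= tau_y) = P(Y = 0)
    p1x = 1.0 - norm.cdf(tau_x)          # marginal P(X = 1)
    px1 = 1.0 - norm.cdf(tau_y)          # marginal P(Y = 1)

    def negloglik(rho):
        # P(X = 1, Y = 1) under a bivariate normal with correlation rho
        p11 = multivariate_normal.cdf([-tau_x, -tau_y],
                                      mean=[0.0, 0.0],
                                      cov=[[1.0, rho], [rho, 1.0]])
        p = np.array([1.0 - p1x - px1 + p11,  # cell (0, 0)
                      px1 - p11,              # cell (0, 1)
                      p1x - p11,              # cell (1, 0)
                      p11])                   # cell (1, 1)
        p = np.clip(p, 1e-12, 1.0)
        counts = np.array([t[0, 0], t[0, 1], t[1, 0], t[1, 1]])
        return -(counts * np.log(p)).sum()

    return minimize_scalar(negloglik, bounds=(-0.99, 0.99), method="bounded").x

rho = tetrachoric_rho([[40, 10], [10, 40]])  # strongly concordant table
```

For this concordant table the estimate is around 0.81; for a table with independent margins it is close to 0.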

It is important to note that there is no tendency for the weird items to contain real words in the language (the table below shows percentages of items that have the corresponding properties):

is.same is.word  is.weird=FALSE  is.weird=TRUE
FALSE   FALSE    39.1            8.7
FALSE   TRUE     6.5             2.2
TRUE    FALSE    40.2            0
TRUE    TRUE     3.3             0

    Fisher's Exact Test for Count Data

data:  table(d_all_items[!d_all_items$is.same, c("is.word", "is.weird")])
p-value = 0.6415
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  0.1249375 10.6175357
sample estimates:
odds ratio 
  1.487323 
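The output above comes from R’s `fisher.test`; the same kind of test can be sketched in Python with `scipy.stats.fisher_exact` on a 2×2 count table. The counts below are purely hypothetical and only illustrate the mechanics; note also that scipy reports the sample odds ratio, while R’s `fisher.test` reports the conditional ML estimate, so the two values can differ slightly.

```python
# Fisher's exact test on a hypothetical 2x2 table of counts
# (rows: is.word FALSE/TRUE; columns: is.weird FALSE/TRUE).
from scipy.stats import fisher_exact

table = [[36, 8],
         [6, 2]]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
# sample odds ratio = (36 * 2) / (8 * 6) = 1.5
```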

2.4.1.4 Does the repetition matter?

While most “same” and all “different” items are presented twice, there are some “same” items that are presented more times: Dc (4), Ec (4), El (4), Ex (4), Ip (6), Is (4), Ll (4), Lp (4), Mc (4), Mh (4), Ml (4), with the maximum number of presentations being 6. So, the question is: “do later presentations differ from the earlier ones?”.

**Figure S22.** Successive presentations for the 'same' items: % correct and tetrachoric correlation between the current and the previous presentation. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS22'/>


**Figure S23.** Successive presentations for the 'different' items: % correct and tetrachoric correlation between the current and the previous presentation. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS23'/>


We fitted a beta regression model (using glmmTMB) of the % correct responses on the interaction between presentation number (between 1 and a maximum of 6, varying by item) and the item type (“same” or “different”). Manual model simplification shows that the interaction and the presentation number do not matter (χ2(2)=0.4, p=0.839, ΔAIC=-3.6), but item type does (χ2(3)=56.5, p=3.21×10⁻¹², ΔAIC=50.5), with the “same” items having an overall higher % of correct responses than the “different” items (Δ%correct = 21.0%, p=2.51×10⁻¹⁵). (There is some overdispersion in the model, 1.32, p=0.016, but probably not enough to qualitatively change these results.) The same lack of a significant effect of presentation number is found separately for the ‘same’ items only (p=0.366) and for the ‘different’ items only (p=0.698).
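The χ2 and ΔAIC values above are linked by a simple identity: when dropping Δdf parameters from a nested model, ΔAIC = χ2 − 2·Δdf (here, 0.4 − 2×2 = −3.6 and 56.5 − 2×3 = 50.5, matching the values reported). A minimal Python sketch of such a likelihood-ratio comparison, using hypothetical log-likelihoods rather than the fitted glmmTMB models:

```python
# Likelihood-ratio test and Delta-AIC for nested models (hypothetical values).
from scipy.stats import chi2

def lrt(ll_full, ll_reduced, df_diff):
    """Chi-squared LRT of a reduced (nested) model against the full model."""
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2.sf(stat, df_diff)

# hypothetical log-likelihoods chosen so that chi2 = 0.4 on 2 df
stat, p = lrt(ll_full=-100.0, ll_reduced=-100.2, df_diff=2)
delta_aic = stat - 2 * 2  # AIC(reduced) - AIC(full) = chi2 - 2 * df_diff
```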

Likewise, the order of the tones for the ‘different’ items does not seem to matter (a one-sample t-test against 0 of the differences between the two orders across the items gives t(25)=1.5, p=0.156).
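The t(25) above corresponds to 26 per-item differences between the two tone orders; the same test can be sketched in Python on simulated (not the study’s) differences:

```python
# One-sample t-test of the order differences against 0 (simulated data:
# the study has 26 'different' item pairs, hence the t(25) reported).
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)
order_diffs = rng.normal(loc=0.0, scale=0.05, size=26)
t_stat, p_val = ttest_1samp(order_diffs, popmean=0.0)
```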

Thus, there seem to be no systematic effects of successive presentations of the same item, or of the order of the two tones for the ‘different’ items, on the percent of correct answers.

2.4.2 Item properties: Mokken Scaling Analysis (MSA)

**Figure S24.** MSA: histogram of the number of Guttman errors (gPlus) across all items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS24'/>


There were 73 cases (out of 492, i.e. 14.8%) with a number of Guttman errors bigger than (Q3 + 1.5*IQR) = 3184.75.
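The flagging rule used here is the standard upper Tukey fence; a small sketch of the computation on hypothetical Guttman-error counts:

```python
# Flag cases whose number of Guttman errors exceeds Q3 + 1.5 * IQR
# (hypothetical gPlus values, for illustration only).
import numpy as np

gplus = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
q1, q3 = np.percentile(gplus, [25, 75])
fence = q3 + 1.5 * (q3 - q1)     # upper Tukey fence
n_flagged = int((gplus > fence).sum())
```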

Table S9. MSA: Homogeneity values (with standard errors and 95%CIs) for all items, sorted by item name.
Items Item H se 95% ci
Ak1 0.204 (0.028) [0.150, 0.259]
Ak2 0.216 (0.026) [0.166, 0.267]
Akt1 0.252 (0.027) [0.200, 0.305]
Akt2 0.257 (0.027) [0.204, 0.309]
At1 0.253 (0.024) [0.205, 0.301]
At2 0.272 (0.025) [0.223, 0.320]
Atk1 0.224 (0.027) [0.171, 0.277]
Atk2 0.241 (0.026) [0.190, 0.293]
Bh1 0.241 (0.023) [0.197, 0.286]
Bh2 0.236 (0.026) [0.185, 0.286]
Bhx1 -0.270 (0.057) [-0.382, -0.157]
Bhx2 -0.263 (0.060) [-0.379, -0.146]
Bs1 0.270 (0.021) [0.229, 0.311]
Bs2 0.270 (0.022) [0.227, 0.313]
Bsv1 0.247 (0.026) [0.195, 0.298]
Bsv2 0.242 (0.028) [0.188, 0.296]
Bv1 0.215 (0.026) [0.164, 0.265]
Bv2 0.219 (0.026) [0.169, 0.269]
Bvs1 0.244 (0.024) [0.196, 0.292]
Bvs2 0.261 (0.024) [0.214, 0.309]
Bx1 0.207 (0.027) [0.154, 0.260]
Bx2 0.274 (0.023) [0.228, 0.320]
Bxh1 -0.432 (0.062) [-0.554, -0.310]
Bxh2 -0.251 (0.064) [-0.377, -0.125]
Ch1 0.216 (0.024) [0.169, 0.263]
Ch2 0.285 (0.021) [0.244, 0.326]
Chv1 0.178 (0.028) [0.124, 0.232]
Chv2 0.183 (0.027) [0.129, 0.236]
Cv1 0.201 (0.025) [0.153, 0.250]
Cv2 0.224 (0.027) [0.171, 0.277]
Cvh1 0.235 (0.027) [0.182, 0.287]
Cvh2 0.232 (0.026) [0.181, 0.283]
Dc1 0.226 (0.024) [0.179, 0.272]
Dc2 0.241 (0.026) [0.190, 0.292]
Dc3 0.274 (0.023) [0.228, 0.320]
Dc4 0.235 (0.024) [0.188, 0.282]
Dch1 0.106 (0.032) [0.043, 0.170]
Dch2 0.212 (0.033) [0.148, 0.275]
Dcx1 0.202 (0.027) [0.149, 0.255]
Dcx2 0.197 (0.027) [0.144, 0.250]
Dh1 0.200 (0.026) [0.150, 0.251]
Dh2 0.159 (0.029) [0.102, 0.217]
Dhc1 0.021 (0.038) [-0.053, 0.095]
Dhc2 0.087 (0.037) [0.014, 0.159]
Dx1 0.206 (0.025) [0.157, 0.254]
Dx2 0.220 (0.025) [0.170, 0.269]
Dxc1 0.238 (0.027) [0.184, 0.291]
Dxc2 0.219 (0.028) [0.164, 0.273]
Ec1 0.232 (0.024) [0.184, 0.280]
Ec2 0.242 (0.026) [0.192, 0.293]
Ec3 0.225 (0.028) [0.171, 0.279]
Ec4 0.258 (0.024) [0.212, 0.304]
Ecl1 0.263 (0.025) [0.213, 0.312]
Ecl2 0.260 (0.025) [0.211, 0.309]
Ecx1 0.197 (0.028) [0.143, 0.252]
Ecx2 0.224 (0.028) [0.170, 0.278]
El1 0.193 (0.028) [0.138, 0.247]
El2 0.217 (0.025) [0.169, 0.266]
El3 0.207 (0.027) [0.154, 0.260]
El4 0.212 (0.026) [0.160, 0.263]
Elc1 0.290 (0.025) [0.242, 0.338]
Elc2 0.260 (0.027) [0.208, 0.312]
Elx1 0.290 (0.025) [0.242, 0.339]
Elx2 0.294 (0.025) [0.246, 0.343]
Ex1 0.227 (0.026) [0.175, 0.278]
Ex2 0.184 (0.025) [0.135, 0.232]
Ex3 0.227 (0.025) [0.177, 0.276]
Ex4 0.210 (0.025) [0.162, 0.259]
Exc1 0.178 (0.028) [0.123, 0.233]
Exc2 0.191 (0.028) [0.137, 0.246]
Exl1 0.282 (0.022) [0.239, 0.326]
Exl2 0.313 (0.024) [0.266, 0.360]
Fs1 0.267 (0.022) [0.223, 0.310]
Fs2 0.298 (0.020) [0.258, 0.337]
Fsv1 0.257 (0.025) [0.207, 0.306]
Fsv2 0.272 (0.026) [0.221, 0.323]
Fv1 0.260 (0.023) [0.214, 0.305]
Fv2 0.267 (0.027) [0.214, 0.319]
Fvs1 0.285 (0.023) [0.239, 0.331]
Fvs2 0.243 (0.025) [0.194, 0.292]
Gt1 0.236 (0.024) [0.188, 0.284]
Gt2 0.272 (0.024) [0.224, 0.319]
Gtx1 0.292 (0.019) [0.256, 0.329]
Gtx2 0.278 (0.024) [0.230, 0.326]
Gx1 0.172 (0.027) [0.120, 0.225]
Gx2 0.220 (0.027) [0.168, 0.272]
Gxt1 0.265 (0.022) [0.221, 0.309]
Gxt2 0.273 (0.025) [0.224, 0.323]
Hs1 0.241 (0.026) [0.189, 0.292]
Hs2 0.269 (0.028) [0.215, 0.324]
Hst1 -0.378 (0.076) [-0.528, -0.228]
Hst2 -0.312 (0.077) [-0.463, -0.162]
Ht1 0.235 (0.025) [0.187, 0.284]
Ht2 0.244 (0.025) [0.194, 0.294]
Hts1 -0.303 (0.067) [-0.434, -0.171]
Hts2 -0.318 (0.074) [-0.463, -0.173]
Ic1 0.205 (0.026) [0.155, 0.255]
Ic2 0.264 (0.024) [0.217, 0.311]
Icp1 0.191 (0.026) [0.139, 0.242]
Icp2 0.246 (0.027) [0.193, 0.298]
Ik1 0.222 (0.025) [0.172, 0.272]
Ik2 0.232 (0.027) [0.180, 0.284]
Iks1 0.204 (0.027) [0.151, 0.257]
Iks2 0.255 (0.027) [0.203, 0.308]
Ip1 0.143 (0.026) [0.093, 0.193]
Ip2 0.225 (0.024) [0.178, 0.273]
Ip3 0.143 (0.028) [0.089, 0.197]
Ip4 0.226 (0.026) [0.176, 0.276]
Ip5 0.214 (0.026) [0.164, 0.264]
Ip6 0.224 (0.025) [0.175, 0.274]
Ipc1 0.254 (0.024) [0.208, 0.301]
Ipc2 0.277 (0.026) [0.227, 0.327]
Ipv1 0.227 (0.029) [0.170, 0.284]
Ipv2 0.233 (0.029) [0.176, 0.289]
Ipx1 0.256 (0.023) [0.211, 0.301]
Ipx2 0.268 (0.025) [0.219, 0.317]
Is1 0.265 (0.022) [0.222, 0.308]
Is2 0.253 (0.023) [0.208, 0.299]
Is3 0.279 (0.023) [0.235, 0.324]
Is4 0.270 (0.025) [0.221, 0.319]
Isk1 0.223 (0.026) [0.171, 0.274]
Isk2 0.239 (0.026) [0.187, 0.290]
Ist1 -0.385 (0.071) [-0.523, -0.246]
Ist2 -0.362 (0.070) [-0.500, -0.224]
It1 0.224 (0.026) [0.174, 0.275]
It2 0.207 (0.027) [0.155, 0.259]
Its1 -0.304 (0.070) [-0.441, -0.167]
Its2 -0.539 (0.084) [-0.703, -0.374]
Iv1 0.276 (0.022) [0.232, 0.319]
Iv2 0.224 (0.024) [0.177, 0.271]
Ivp1 0.216 (0.028) [0.162, 0.270]
Ivp2 0.213 (0.028) [0.158, 0.269]
Ix1 0.191 (0.025) [0.142, 0.239]
Ix2 0.236 (0.024) [0.188, 0.284]
Ixp1 0.270 (0.023) [0.224, 0.316]
Ixp2 0.253 (0.027) [0.201, 0.305]
Jk1 0.193 (0.025) [0.145, 0.242]
Jk2 0.241 (0.026) [0.191, 0.292]
Jkx1 0.273 (0.024) [0.226, 0.320]
Jkx2 0.254 (0.024) [0.207, 0.302]
Jx1 0.222 (0.027) [0.169, 0.275]
Jx2 0.202 (0.027) [0.149, 0.256]
Jxk1 0.240 (0.026) [0.189, 0.291]
Jxk2 0.236 (0.026) [0.185, 0.287]
Kk1 0.251 (0.023) [0.205, 0.296]
Kk2 0.285 (0.023) [0.240, 0.329]
Kkl1 0.121 (0.040) [0.042, 0.199]
Kkl2 -0.018 (0.045) [-0.106, 0.070]
Kl1 0.287 (0.028) [0.231, 0.343]
Kl2 0.295 (0.023) [0.250, 0.340]
Klk1 -0.092 (0.057) [-0.203, 0.020]
Klk2 -0.103 (0.054) [-0.210, 0.004]
Lk1 0.245 (0.024) [0.198, 0.292]
Lk2 0.244 (0.026) [0.194, 0.295]
Lkl1 0.204 (0.028) [0.149, 0.259]
Lkl2 0.187 (0.030) [0.128, 0.246]
Ll1 0.252 (0.026) [0.201, 0.303]
Ll2 0.203 (0.026) [0.152, 0.254]
Ll3 0.232 (0.025) [0.183, 0.280]
Ll4 0.185 (0.027) [0.131, 0.239]
Llk1 0.150 (0.031) [0.089, 0.211]
Llk2 0.171 (0.032) [0.110, 0.233]
Llp1 -0.202 (0.061) [-0.321, -0.083]
Llp2 -0.028 (0.056) [-0.138, 0.083]
Lp1 0.205 (0.026) [0.155, 0.255]
Lp2 0.227 (0.026) [0.176, 0.278]
Lp3 0.244 (0.024) [0.198, 0.291]
Lp4 0.267 (0.026) [0.217, 0.317]
Lpl1 -0.129 (0.054) [-0.236, -0.023]
Lpl2 -0.163 (0.058) [-0.277, -0.050]
Lpv1 0.189 (0.029) [0.133, 0.246]
Lpv2 0.200 (0.029) [0.144, 0.256]
Lv1 0.200 (0.026) [0.149, 0.252]
Lv2 0.211 (0.025) [0.162, 0.259]
Lvp1 0.192 (0.028) [0.138, 0.247]
Lvp2 0.198 (0.028) [0.143, 0.253]
Mc1 0.095 (0.028) [0.039, 0.151]
Mc2 0.062 (0.029) [0.005, 0.119]
Mc3 0.101 (0.028) [0.046, 0.157]
Mc4 0.002 (0.026) [-0.049, 0.054]
Mch1 0.112 (0.031) [0.051, 0.173]
Mch2 0.187 (0.031) [0.126, 0.247]
Mcl1 0.280 (0.022) [0.237, 0.323]
Mcl2 0.275 (0.024) [0.229, 0.321]
Mh1 0.173 (0.026) [0.122, 0.224]
Mh2 0.211 (0.025) [0.161, 0.260]
Mh3 0.215 (0.024) [0.169, 0.261]
Mh4 0.207 (0.026) [0.157, 0.258]
Mhc1 0.147 (0.032) [0.084, 0.209]
Mhc2 0.188 (0.032) [0.126, 0.251]
Mhl1 0.255 (0.023) [0.210, 0.301]
Mhl2 0.293 (0.024) [0.247, 0.339]
Ml1 0.238 (0.025) [0.189, 0.287]
Ml2 0.209 (0.026) [0.158, 0.261]
Ml3 0.202 (0.026) [0.152, 0.253]
Ml4 0.163 (0.031) [0.103, 0.224]
Mlc1 0.294 (0.023) [0.249, 0.339]
Mlc2 0.270 (0.022) [0.228, 0.313]
Mlh1 0.256 (0.023) [0.211, 0.301]
Mlh2 0.300 (0.023) [0.255, 0.344]
Nk1 0.233 (0.023) [0.188, 0.279]
Nk2 0.245 (0.025) [0.196, 0.293]
Nkp1 0.200 (0.028) [0.145, 0.255]
Nkp2 0.217 (0.028) [0.161, 0.273]
Np1 0.241 (0.024) [0.194, 0.288]
Np2 0.229 (0.033) [0.165, 0.293]
Npk1 0.174 (0.029) [0.117, 0.231]
Npk2 0.190 (0.030) [0.131, 0.250]
Table S10. Same table as above but sorted by homogeneity.
Items Item H se 95% ci
Exl2 0.313 (0.024) [0.266, 0.360]
Mlh2 0.300 (0.023) [0.255, 0.344]
Fs2 0.298 (0.020) [0.258, 0.337]
Kl2 0.295 (0.023) [0.250, 0.340]
Elx2 0.294 (0.025) [0.246, 0.343]
Mlc1 0.294 (0.023) [0.249, 0.339]
Mhl2 0.293 (0.024) [0.247, 0.339]
Gtx1 0.292 (0.019) [0.256, 0.329]
Elc1 0.290 (0.025) [0.242, 0.338]
Elx1 0.290 (0.025) [0.242, 0.339]
Kl1 0.287 (0.028) [0.231, 0.343]
Ch2 0.285 (0.021) [0.244, 0.326]
Fvs1 0.285 (0.023) [0.239, 0.331]
Kk2 0.285 (0.023) [0.240, 0.329]
Exl1 0.282 (0.022) [0.239, 0.326]
Mcl1 0.280 (0.022) [0.237, 0.323]
Is3 0.279 (0.023) [0.235, 0.324]
Gtx2 0.278 (0.024) [0.230, 0.326]
Ipc2 0.277 (0.026) [0.227, 0.327]
Iv1 0.276 (0.022) [0.232, 0.319]
Mcl2 0.275 (0.024) [0.229, 0.321]
Bx2 0.274 (0.023) [0.228, 0.320]
Dc3 0.274 (0.023) [0.228, 0.320]
Gxt2 0.273 (0.025) [0.224, 0.323]
Jkx1 0.273 (0.024) [0.226, 0.320]
At2 0.272 (0.025) [0.223, 0.320]
Fsv2 0.272 (0.026) [0.221, 0.323]
Gt2 0.272 (0.024) [0.224, 0.319]
Bs1 0.270 (0.021) [0.229, 0.311]
Bs2 0.270 (0.022) [0.227, 0.313]
Is4 0.270 (0.025) [0.221, 0.319]
Ixp1 0.270 (0.023) [0.224, 0.316]
Mlc2 0.270 (0.022) [0.228, 0.313]
Hs2 0.269 (0.028) [0.215, 0.324]
Ipx2 0.268 (0.025) [0.219, 0.317]
Fs1 0.267 (0.022) [0.223, 0.310]
Fv2 0.267 (0.027) [0.214, 0.319]
Lp4 0.267 (0.026) [0.217, 0.317]
Gxt1 0.265 (0.022) [0.221, 0.309]
Is1 0.265 (0.022) [0.222, 0.308]
Ic2 0.264 (0.024) [0.217, 0.311]
Ecl1 0.263 (0.025) [0.213, 0.312]
Bvs2 0.261 (0.024) [0.214, 0.309]
Ecl2 0.260 (0.025) [0.211, 0.309]
Elc2 0.260 (0.027) [0.208, 0.312]
Fv1 0.260 (0.023) [0.214, 0.305]
Ec4 0.258 (0.024) [0.212, 0.304]
Akt2 0.257 (0.027) [0.204, 0.309]
Fsv1 0.257 (0.025) [0.207, 0.306]
Ipx1 0.256 (0.023) [0.211, 0.301]
Mlh1 0.256 (0.023) [0.211, 0.301]
Iks2 0.255 (0.027) [0.203, 0.308]
Mhl1 0.255 (0.023) [0.210, 0.301]
Ipc1 0.254 (0.024) [0.208, 0.301]
Jkx2 0.254 (0.024) [0.207, 0.302]
At1 0.253 (0.024) [0.205, 0.301]
Is2 0.253 (0.023) [0.208, 0.299]
Ixp2 0.253 (0.027) [0.201, 0.305]
Akt1 0.252 (0.027) [0.200, 0.305]
Ll1 0.252 (0.026) [0.201, 0.303]
Kk1 0.251 (0.023) [0.205, 0.296]
Bsv1 0.247 (0.026) [0.195, 0.298]
Icp2 0.246 (0.027) [0.193, 0.298]
Lk1 0.245 (0.024) [0.198, 0.292]
Nk2 0.245 (0.025) [0.196, 0.293]
Bvs1 0.244 (0.024) [0.196, 0.292]
Ht2 0.244 (0.025) [0.194, 0.294]
Lk2 0.244 (0.026) [0.194, 0.295]
Lp3 0.244 (0.024) [0.198, 0.291]
Fvs2 0.243 (0.025) [0.194, 0.292]
Bsv2 0.242 (0.028) [0.188, 0.296]
Ec2 0.242 (0.026) [0.192, 0.293]
Atk2 0.241 (0.026) [0.190, 0.293]
Bh1 0.241 (0.023) [0.197, 0.286]
Dc2 0.241 (0.026) [0.190, 0.292]
Hs1 0.241 (0.026) [0.189, 0.292]
Jk2 0.241 (0.026) [0.191, 0.292]
Np1 0.241 (0.024) [0.194, 0.288]
Jxk1 0.240 (0.026) [0.189, 0.291]
Isk2 0.239 (0.026) [0.187, 0.290]
Dxc1 0.238 (0.027) [0.184, 0.291]
Ml1 0.238 (0.025) [0.189, 0.287]
Bh2 0.236 (0.026) [0.185, 0.286]
Gt1 0.236 (0.024) [0.188, 0.284]
Ix2 0.236 (0.024) [0.188, 0.284]
Jxk2 0.236 (0.026) [0.185, 0.287]
Cvh1 0.235 (0.027) [0.182, 0.287]
Dc4 0.235 (0.024) [0.188, 0.282]
Ht1 0.235 (0.025) [0.187, 0.284]
Ipv2 0.233 (0.029) [0.176, 0.289]
Nk1 0.233 (0.023) [0.188, 0.279]
Cvh2 0.232 (0.026) [0.181, 0.283]
Ec1 0.232 (0.024) [0.184, 0.280]
Ik2 0.232 (0.027) [0.180, 0.284]
Ll3 0.232 (0.025) [0.183, 0.280]
Np2 0.229 (0.033) [0.165, 0.293]
Ex1 0.227 (0.026) [0.175, 0.278]
Ex3 0.227 (0.025) [0.177, 0.276]
Ipv1 0.227 (0.029) [0.170, 0.284]
Lp2 0.227 (0.026) [0.176, 0.278]
Dc1 0.226 (0.024) [0.179, 0.272]
Ip4 0.226 (0.026) [0.176, 0.276]
Ec3 0.225 (0.028) [0.171, 0.279]
Ip2 0.225 (0.024) [0.178, 0.273]
Atk1 0.224 (0.027) [0.171, 0.277]
Cv2 0.224 (0.027) [0.171, 0.277]
Ecx2 0.224 (0.028) [0.170, 0.278]
Ip6 0.224 (0.025) [0.175, 0.274]
It1 0.224 (0.026) [0.174, 0.275]
Iv2 0.224 (0.024) [0.177, 0.271]
Isk1 0.223 (0.026) [0.171, 0.274]
Ik1 0.222 (0.025) [0.172, 0.272]
Jx1 0.222 (0.027) [0.169, 0.275]
Dx2 0.220 (0.025) [0.170, 0.269]
Gx2 0.220 (0.027) [0.168, 0.272]
Bv2 0.219 (0.026) [0.169, 0.269]
Dxc2 0.219 (0.028) [0.164, 0.273]
El2 0.217 (0.025) [0.169, 0.266]
Nkp2 0.217 (0.028) [0.161, 0.273]
Ak2 0.216 (0.026) [0.166, 0.267]
Ch1 0.216 (0.024) [0.169, 0.263]
Ivp1 0.216 (0.028) [0.162, 0.270]
Bv1 0.215 (0.026) [0.164, 0.265]
Mh3 0.215 (0.024) [0.169, 0.261]
Ip5 0.214 (0.026) [0.164, 0.264]
Ivp2 0.213 (0.028) [0.158, 0.269]
Dch2 0.212 (0.033) [0.148, 0.275]
El4 0.212 (0.026) [0.160, 0.263]
Lv2 0.211 (0.025) [0.162, 0.259]
Mh2 0.211 (0.025) [0.161, 0.260]
Ex4 0.210 (0.025) [0.162, 0.259]
Ml2 0.209 (0.026) [0.158, 0.261]
Bx1 0.207 (0.027) [0.154, 0.260]
El3 0.207 (0.027) [0.154, 0.260]
It2 0.207 (0.027) [0.155, 0.259]
Mh4 0.207 (0.026) [0.157, 0.258]
Dx1 0.206 (0.025) [0.157, 0.254]
Ic1 0.205 (0.026) [0.155, 0.255]
Lp1 0.205 (0.026) [0.155, 0.255]
Ak1 0.204 (0.028) [0.150, 0.259]
Iks1 0.204 (0.027) [0.151, 0.257]
Lkl1 0.204 (0.028) [0.149, 0.259]
Ll2 0.203 (0.026) [0.152, 0.254]
Dcx1 0.202 (0.027) [0.149, 0.255]
Jx2 0.202 (0.027) [0.149, 0.256]
Ml3 0.202 (0.026) [0.152, 0.253]
Cv1 0.201 (0.025) [0.153, 0.250]
Dh1 0.200 (0.026) [0.150, 0.251]
Lpv2 0.200 (0.029) [0.144, 0.256]
Lv1 0.200 (0.026) [0.149, 0.252]
Nkp1 0.200 (0.028) [0.145, 0.255]
Lvp2 0.198 (0.028) [0.143, 0.253]
Dcx2 0.197 (0.027) [0.144, 0.250]
Ecx1 0.197 (0.028) [0.143, 0.252]
El1 0.193 (0.028) [0.138, 0.247]
Jk1 0.193 (0.025) [0.145, 0.242]
Lvp1 0.192 (0.028) [0.138, 0.247]
Exc2 0.191 (0.028) [0.137, 0.246]
Icp1 0.191 (0.026) [0.139, 0.242]
Ix1 0.191 (0.025) [0.142, 0.239]
Npk2 0.190 (0.030) [0.131, 0.250]
Lpv1 0.189 (0.029) [0.133, 0.246]
Mhc2 0.188 (0.032) [0.126, 0.251]
Lkl2 0.187 (0.030) [0.128, 0.246]
Mch2 0.187 (0.031) [0.126, 0.247]
Ll4 0.185 (0.027) [0.131, 0.239]
Ex2 0.184 (0.025) [0.135, 0.232]
Chv2 0.183 (0.027) [0.129, 0.236]
Chv1 0.178 (0.028) [0.124, 0.232]
Exc1 0.178 (0.028) [0.123, 0.233]
Npk1 0.174 (0.029) [0.117, 0.231]
Mh1 0.173 (0.026) [0.122, 0.224]
Gx1 0.172 (0.027) [0.120, 0.225]
Llk2 0.171 (0.032) [0.110, 0.233]
Ml4 0.163 (0.031) [0.103, 0.224]
Dh2 0.159 (0.029) [0.102, 0.217]
Llk1 0.150 (0.031) [0.089, 0.211]
Mhc1 0.147 (0.032) [0.084, 0.209]
Ip1 0.143 (0.026) [0.093, 0.193]
Ip3 0.143 (0.028) [0.089, 0.197]
Kkl1 0.121 (0.040) [0.042, 0.199]
Mch1 0.112 (0.031) [0.051, 0.173]
Dch1 0.106 (0.032) [0.043, 0.170]
Mc3 0.101 (0.028) [0.046, 0.157]
Mc1 0.095 (0.028) [0.039, 0.151]
Dhc2 0.087 (0.037) [0.014, 0.159]
Mc2 0.062 (0.029) [0.005, 0.119]
Dhc1 0.021 (0.038) [-0.053, 0.095]
Mc4 0.002 (0.026) [-0.049, 0.054]
Kkl2 -0.018 (0.045) [-0.106, 0.070]
Llp2 -0.028 (0.056) [-0.138, 0.083]
Klk1 -0.092 (0.057) [-0.203, 0.020]
Klk2 -0.103 (0.054) [-0.210, 0.004]
Lpl1 -0.129 (0.054) [-0.236, -0.023]
Lpl2 -0.163 (0.058) [-0.277, -0.050]
Llp1 -0.202 (0.061) [-0.321, -0.083]
Bxh2 -0.251 (0.064) [-0.377, -0.125]
Bhx2 -0.263 (0.060) [-0.379, -0.146]
Bhx1 -0.270 (0.057) [-0.382, -0.157]
Hts1 -0.303 (0.067) [-0.434, -0.171]
Its1 -0.304 (0.070) [-0.441, -0.167]
Hst2 -0.312 (0.077) [-0.463, -0.162]
Hts2 -0.318 (0.074) [-0.463, -0.173]
Ist2 -0.362 (0.070) [-0.500, -0.224]
Hst1 -0.378 (0.076) [-0.528, -0.228]
Ist1 -0.385 (0.071) [-0.523, -0.246]
Bxh1 -0.432 (0.062) [-0.554, -0.310]
Its2 -0.539 (0.084) [-0.703, -0.374]

The complete item set has a homogeneity value H (se, 95%CI) of 0.201 (0.008) [0.186, 0.216]: this is significantly lower than the recommended 0.30, suggesting that the scale is not homogeneous. This is further supported by the fact that few items have a homogeneity around or above this value (only 2 if we consider the point estimate, and 59 out of 208 if we consider a 95%CI with an upper limit above 0.3). Interestingly, the homogeneity values of related items (different presentations and different orders of the tones) are overall very similar, suggesting again that this is an intrinsic property of the segments and tone(s) and not of their repeated presentation or of the order of the tones.
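The scale homogeneity H reported here is Loevinger’s H; for binary items it can be sketched as the ratio between the sum of the inter-item covariances and the sum of their maxima given the item marginals. The following is an illustrative reimplementation under that definition, not the mokken package’s code:

```python
# Loevinger's scale H for binary items: sum of inter-item covariances divided
# by the sum of the maximal covariances allowed by the item marginals.
import numpy as np

def scale_H(X):
    """X: n_persons x n_items matrix of 0/1 responses."""
    X = np.asarray(X, dtype=float)
    p = X.mean(axis=0)                        # item popularities
    C = np.cov(X, rowvar=False, bias=True)    # population covariances
    num = den = 0.0
    k = X.shape[1]
    for i in range(k):
        for j in range(i + 1, k):
            lo, hi = sorted((p[i], p[j]))
            num += C[i, j]
            den += lo * (1.0 - hi)            # max covariance given marginals
    return num / den

# a perfect Guttman scale has H = 1
H = scale_H([[0, 0], [1, 0], [1, 1]])
```

A perfectly “anti-scalable” pattern (e.g., `[[1, 0], [0, 1]]`) gives H = −1, which is why the “weird” items above pull the scale H down so strongly.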

Table S11. MSA: aisp for increasing H thresholds (c) for all items.
Items c=0.05 c=0.10 c=0.15 c=0.20 c=0.25 c=0.30 c=0.35 c=0.40 c=0.45 c=0.50 c=0.55 c=0.60
Ak1 1 1 1 1 1 0 25 0 0 0 0 0
Ak2 1 1 1 1 1 1 7 15 23 0 0 0
Akt1 1 1 1 1 1 2 2 12 21 29 0 0
Akt2 1 1 1 1 1 2 2 43 0 0 0 0
At1 1 1 1 1 1 1 1 0 0 0 0 0
At2 1 1 1 1 1 1 1 7 12 17 0 0
Atk1 1 1 1 1 1 2 2 38 0 0 0 0
Atk2 1 1 1 1 1 2 2 0 0 0 0 0
Bh1 1 1 1 1 1 1 1 8 0 0 0 0
Bh2 1 1 1 1 1 1 11 24 36 0 0 0
Bhx1 4 4 5 7 14 0 0 0 0 0 0 0
Bhx2 4 4 5 7 9 0 0 0 0 0 0 0
Bs1 1 1 1 1 1 1 1 1 1 0 0 0
Bs2 1 1 1 1 1 1 1 1 1 14 14 0
Bsv1 1 1 1 1 1 2 4 18 31 0 0 0
Bsv2 1 1 1 1 1 2 2 14 20 0 0 0
Bv1 1 1 1 1 1 1 5 10 15 23 0 0
Bv2 1 1 1 1 1 1 16 32 0 0 0 0
Bvs1 1 1 1 1 1 2 2 23 35 0 0 0
Bvs2 1 1 1 1 1 2 2 14 20 27 0 0
Bx1 1 1 1 1 1 1 11 24 41 0 0 0
Bx2 1 1 1 1 1 1 1 1 1 1 1 1
Bxh1 4 4 5 7 14 0 0 0 0 0 0 0
Bxh2 4 4 5 7 0 0 0 0 0 0 0 0
Ch1 1 1 1 1 1 1 5 31 0 0 0 0
Ch2 1 1 1 1 1 1 1 1 9 12 12 0
Chv1 1 1 1 1 6 12 0 0 0 0 0 0
Chv2 5 5 4 4 6 0 22 0 0 0 0 0
Cv1 1 1 1 1 1 10 28 44 0 0 0 0
Cv2 1 1 1 1 1 1 15 30 0 0 0 0
Cvh1 1 1 1 1 1 2 21 42 0 0 0 0
Cvh2 1 1 1 1 1 2 22 39 0 0 0 0
Dc1 1 1 1 1 1 10 12 0 0 0 0 0
Dc2 1 1 1 1 1 1 1 8 11 19 0 0
Dc3 1 1 1 1 1 1 1 1 28 0 0 0
Dc4 1 1 1 1 1 1 5 33 0 0 0 0
Dch1 3 3 3 8 6 13 0 0 0 0 0 0
Dch2 3 3 3 3 6 2 13 6 20 27 0 0
Dcx1 1 1 1 1 3 2 13 26 0 0 0 0
Dcx2 3 3 3 3 3 6 13 26 0 0 0 0
Dh1 1 1 1 1 1 11 0 0 39 0 0 0
Dh2 2 2 2 2 2 3 3 3 4 6 6 0
Dhc1 3 3 4 8 0 0 0 0 0 0 0 0
Dhc2 4 4 4 4 5 13 0 0 0 0 0 0
Dx1 1 1 1 1 1 1 19 37 16 24 0 0
Dx2 1 1 1 1 1 1 24 11 28 0 0 0
Dxc1 3 3 3 3 3 2 23 20 6 8 8 0
Dxc2 1 1 1 1 1 2 0 0 0 0 0 0
Ec1 1 1 1 1 1 1 11 27 0 0 0 0
Ec2 1 1 1 1 1 1 1 8 33 0 0 0
Ec3 1 1 1 1 1 1 9 4 27 0 0 0
Ec4 1 1 1 1 1 1 1 1 1 4 4 4
Ecl1 1 1 1 1 1 2 2 29 0 0 0 0
Ecl2 1 1 1 1 1 2 2 23 0 0 0 0
Ecx1 5 5 4 6 10 0 0 35 24 0 0 0
Ecx2 1 1 1 1 1 2 13 0 17 10 10 0
El1 1 1 1 1 12 11 19 37 0 0 0 0
El2 1 1 1 1 1 1 11 27 0 0 0 0
El3 1 1 1 1 1 1 0 33 0 0 0 0
El4 1 1 1 1 1 1 5 33 0 0 0 0
Elc1 1 1 1 1 1 1 2 2 7 0 0 0
Elc2 1 1 1 1 1 2 2 2 37 0 0 0
Elx1 1 1 1 1 1 1 1 2 3 5 0 0
Elx2 1 1 1 1 1 1 2 2 3 5 5 0
Ex1 1 1 1 1 1 1 1 8 13 4 4 4
Ex2 1 1 1 1 0 0 0 0 0 0 0 0
Ex3 1 1 1 1 1 1 1 1 1 2 2 2
Ex4 1 1 1 1 1 1 11 24 36 0 0 0
Exc1 3 3 3 6 10 0 0 0 0 0 0 0
Exc2 3 3 3 3 3 6 4 18 31 0 0 0
Exl1 1 1 1 1 1 2 2 2 3 28 0 0
Exl2 1 1 1 1 1 1 1 2 3 20 0 0
Fs1 1 1 1 1 1 1 1 1 16 24 0 0
Fs2 1 1 1 1 1 1 1 1 10 0 0 0
Fsv1 1 1 1 1 1 2 2 22 34 0 0 0
Fsv2 1 1 1 1 1 2 2 2 37 0 0 0
Fv1 1 1 1 1 1 1 1 7 9 17 0 0
Fv2 1 1 1 1 1 1 2 29 8 11 11 0
Fvs1 1 1 1 1 1 1 2 2 3 0 0 0
Fvs2 1 1 1 1 1 2 2 12 22 30 0 0
Gt1 1 1 1 1 1 1 5 10 39 0 0 0
Gt2 1 1 1 1 1 1 1 1 1 14 14 0
Gtx1 1 1 1 1 1 2 2 2 3 5 5 0
Gtx2 1 1 1 1 1 2 2 2 2 3 3 3
Gx1 1 1 1 1 11 0 6 13 26 31 0 0
Gx2 1 1 1 1 1 1 1 8 11 19 0 0
Gxt1 1 1 1 1 1 2 2 2 2 10 10 0
Gxt2 1 1 1 1 1 2 2 12 21 29 0 0
Hs1 1 1 1 1 1 1 6 13 13 31 0 0
Hs2 1 1 1 1 1 1 1 1 10 13 13 0
Hst1 4 4 5 5 7 0 0 0 0 0 0 0
Hst2 4 4 5 5 7 15 0 0 0 0 0 0
Ht1 1 1 1 1 1 1 12 0 0 0 0 0
Ht2 1 1 1 1 1 1 12 4 5 7 7 0
Hts1 4 4 5 5 7 0 0 0 0 0 0 0
Hts2 4 4 5 5 7 15 0 0 0 0 0 0
Ic1 1 1 1 1 1 1 25 0 33 0 0 0
Ic2 1 1 1 1 1 1 1 1 1 0 0 0
Icp1 3 3 3 3 3 0 14 28 0 0 0 0
Icp2 1 1 1 1 1 2 23 0 0 0 0 0
Ik1 1 1 1 1 1 5 0 0 0 0 0 0
Ik2 1 1 1 1 1 1 7 31 0 0 0 0
Iks1 3 3 3 3 3 8 0 0 0 0 0 0
Iks2 1 1 1 1 1 2 2 2 0 0 0 0
Ip1 1 1 1 0 0 0 0 0 0 0 0 0
Ip2 1 1 1 1 1 1 1 40 0 0 0 0
Ip3 2 2 0 0 0 0 0 0 0 0 0 0
Ip4 1 1 1 1 1 1 6 7 26 0 0 0
Ip5 1 1 1 1 1 1 9 40 0 0 0 0
Ip6 1 1 1 1 1 1 15 30 0 0 0 0
Ipc1 1 1 1 1 1 2 2 12 22 30 0 0
Ipc2 1 1 1 1 1 2 2 2 2 18 0 0
Ipv1 1 1 1 1 1 2 2 5 7 9 9 0
Ipv2 1 1 1 1 1 2 2 2 2 3 3 3
Ipx1 1 1 1 1 1 2 2 5 7 9 9 0
Ipx2 1 1 1 1 1 2 2 20 0 0 0 0
Is1 1 1 1 1 1 1 1 6 9 15 15 0
Is2 1 1 1 1 1 1 10 0 0 0 0 0
Is3 1 1 1 1 1 1 1 4 5 7 7 0
Is4 1 1 1 1 1 1 1 6 9 15 15 0
Isk1 1 1 1 1 1 12 18 36 0 0 0 0
Isk2 1 1 1 1 1 2 23 38 0 0 0 0
Ist1 4 4 0 0 0 0 0 0 0 0 0 0
Ist2 4 4 5 5 9 0 0 0 0 0 0 0
It1 1 1 1 1 1 1 7 15 23 0 0 0
It2 1 1 1 1 1 0 0 0 0 0 0 0
Its1 4 4 5 0 0 0 0 0 0 0 0 0
Its2 4 4 5 5 7 0 0 0 0 0 0 0
Iv1 1 1 1 1 1 1 1 1 30 16 16 0
Iv2 1 1 1 1 1 1 9 21 27 0 0 0
Ivp1 1 1 1 1 1 2 14 28 0 0 0 0
Ivp2 1 1 1 1 1 2 2 23 35 0 0 0
Ix1 1 1 1 1 12 0 11 0 0 0 0 0
Ix2 1 1 1 1 1 9 16 32 0 20 0 0
Ixp1 1 1 1 1 1 2 2 5 17 0 0 0
Ixp2 1 1 1 1 1 2 2 17 29 0 0 0
Jk1 1 1 1 1 11 0 7 15 0 0 0 0
Jk2 1 1 1 1 1 1 28 0 0 0 0 0
Jkx1 1 1 1 1 1 2 2 2 24 0 0 0
Jkx2 1 1 1 1 1 2 2 35 0 0 0 0
Jx1 1 1 1 1 1 1 1 4 5 16 16 0
Jx2 1 1 1 1 1 9 0 0 0 0 0 0
Jxk1 1 1 1 1 1 2 4 9 14 22 0 0
Jxk2 1 1 1 1 1 2 2 38 0 0 0 0
Kk1 1 1 1 1 1 1 1 8 0 0 0 0
Kk2 1 1 1 1 1 1 1 1 1 2 2 2
Kkl1 3 3 3 3 4 7 27 0 0 0 0 0
Kkl2 4 4 5 0 0 7 20 41 0 0 0 0
Kl1 1 1 1 1 1 1 1 1 9 12 12 0
Kl2 1 1 1 1 1 1 1 1 1 1 1 1
Klk1 3 3 3 3 4 4 0 0 0 0 0 0
Klk2 4 4 5 5 13 0 0 0 0 0 0 0
Lk1 1 1 1 1 1 1 12 4 30 0 0 0
Lk2 1 1 1 1 1 1 1 1 10 13 13 0
Lkl1 1 1 1 1 3 7 0 0 25 0 0 0
Lkl2 3 3 3 3 3 7 20 41 0 0 0 0
Ll1 1 1 1 1 1 1 1 11 19 26 0 0
Ll2 1 1 1 1 1 1 9 7 0 0 0 0
Ll3 1 1 1 1 1 1 1 21 27 0 0 0
Ll4 1 1 1 1 8 10 24 0 0 0 0 0
Llk1 3 3 3 3 13 0 0 0 0 0 0 0
Llk2 3 3 3 3 3 7 27 43 0 0 0 0
Llp1 4 4 6 9 0 0 0 0 0 0 0 0
Llp2 3 3 3 3 3 8 8 25 40 0 0 0
Lp1 1 1 1 1 1 14 0 0 0 0 0 0
Lp2 1 1 1 1 1 1 1 16 0 0 0 0
Lp3 1 1 1 1 1 1 1 16 12 25 0 0
Lp4 1 1 1 1 1 1 1 1 8 12 0 0
Lpl1 4 4 6 0 0 0 0 0 0 0 0 0
Lpl2 4 4 6 9 0 0 0 0 0 0 0 0
Lpv1 3 3 3 3 3 8 8 0 0 0 0 0
Lpv2 3 3 3 3 3 8 8 17 29 0 0 0
Lv1 1 1 1 1 1 1 5 10 15 23 0 0
Lv2 1 1 1 1 1 1 5 44 0 0 0 0
Lvp1 3 3 3 3 3 8 8 0 0 0 0 0
Lvp2 3 3 3 3 3 2 8 25 40 0 0 0
Mc1 2 2 2 2 2 3 26 0 0 0 0 0
Mc2 2 2 2 2 2 3 3 3 4 6 6 0
Mc3 2 2 2 2 2 5 0 0 0 0 0 0
Mc4 0 0 2 2 0 0 0 0 0 0 0 0
Mch1 3 3 4 4 5 0 0 0 0 0 0 0
Mch2 5 5 4 4 5 6 4 9 14 22 0 0
Mcl1 1 1 1 1 1 2 2 2 2 21 0 0
Mcl2 1 1 1 1 1 2 2 2 2 18 0 0
Mh1 1 1 1 1 2 5 17 34 0 0 0 0
Mh2 1 1 1 1 1 1 7 45 0 0 0 0
Mh3 1 1 1 1 1 5 17 34 0 0 0 0
Mh4 1 1 1 1 1 5 17 0 0 0 0 0
Mhc1 3 3 4 4 5 0 0 0 38 0 0 0
Mhc2 3 3 3 3 5 6 0 22 34 28 0 0
Mhl1 1 1 1 1 1 2 2 14 25 0 0 0
Mhl2 1 1 1 1 1 2 2 2 2 18 0 0
Ml1 1 1 1 1 1 1 1 1 12 25 0 0
Ml2 1 1 1 1 1 1 7 45 41 0 0 0
Ml3 1 1 1 1 1 3 26 0 0 0 0 0
Ml4 1 1 1 1 8 14 0 0 0 0 0 0
Mlc1 1 1 1 1 1 1 1 2 8 11 11 0
Mlc2 1 1 1 1 1 2 2 2 6 8 8 0
Mlh1 1 1 1 1 1 2 2 20 38 0 0 0
Mlh2 1 1 1 1 1 1 2 2 18 21 0 0
Nk1 1 1 1 1 1 1 6 39 0 0 0 0
Nk2 1 1 1 1 1 1 18 36 0 0 0 0
Nkp1 1 1 1 1 4 0 0 0 18 0 0 0
Nkp2 5 5 4 6 4 2 21 42 0 0 0 0
Np1 1 1 1 1 1 1 1 1 16 2 0 0
Np2 1 1 1 1 1 1 9 11 19 26 0 0
Npk1 3 3 3 3 4 4 10 19 32 0 0 0
Npk2 3 3 3 3 4 4 10 19 32 0 0 0

These results suggest that there is probably more than one scale, and that some items behave very similarly; but before deciding whether to continue in this direction, let’s see what PCA/FA might suggest.

2.4.3 PCA and FA

For PCA, PC1 explains 20.1% of the variance, followed by PC2 which explains 5.5%, suggesting there is a main factor on which most items load, but the story is a bit more complex, with at least a 2nd factor needed (and a lot of variation remaining unexplained by these 2 components):

**Figure S25.** Screeplot of the PCA of all the tone items together. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS25'/>


**Figure S26.** Loading of the tone items on the first 2 PCs. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS26'/>


**Figure S27.** The participants plotted on the first 2 PCs, colored by their qualities of representation (cos2). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS27'/>


The actual loadings on the first two PCs are:

PC1 PC2
Ak1 -0.07 -0.05
Ak2 -0.08 -0.06
Akt1 -0.07 0.09
Akt2 -0.07 0.09
At1 -0.08 -0.03
At2 -0.08 -0.04
Atk1 -0.06 0.10
Atk2 -0.06 0.09
Bh1 -0.08 -0.03
Bh2 -0.07 -0.04
Bhx1 0.05 0.06
Bhx2 0.04 0.06
Bs1 -0.09 -0.04
Bs2 -0.09 -0.05
Bsv1 -0.07 0.08
Bsv2 -0.07 0.10
Bv1 -0.08 -0.05
Bv2 -0.08 -0.04
Bvs1 -0.07 0.08
Bvs2 -0.08 0.08
Bx1 -0.07 -0.03
Bx2 -0.10 -0.04
Bxh1 0.06 0.01
Bxh2 0.04 0.04
Ch1 -0.08 -0.07
Ch2 -0.10 -0.05
Chv1 -0.04 0.09
Chv2 -0.05 0.09
Cv1 -0.07 -0.06
Cv2 -0.07 -0.04
Cvh1 -0.07 0.09
Cvh2 -0.07 0.07
Dc1 -0.07 -0.02
Dc2 -0.07 -0.07
Dc3 -0.09 -0.07
Dc4 -0.08 -0.05
Dch1 -0.01 0.10
Dch2 -0.03 0.09
Dcx1 -0.05 0.10
Dcx2 -0.05 0.10
Dh1 -0.07 -0.04
Dh2 -0.06 -0.07
Dhc1 0.01 0.08
Dhc2 0.00 0.10
Dx1 -0.07 -0.07
Dx2 -0.08 -0.06
Dxc1 -0.06 0.09
Dxc2 -0.05 0.10
Ec1 -0.08 -0.06
Ec2 -0.08 -0.05
Ec3 -0.07 -0.06
Ec4 -0.08 -0.08
Ecl1 -0.08 0.08
Ecl2 -0.08 0.09
Ecx1 -0.05 0.08
Ecx2 -0.05 0.10
El1 -0.07 -0.06
El2 -0.08 -0.07
El3 -0.07 -0.06
El4 -0.08 -0.08
Elc1 -0.09 0.08
Elc2 -0.07 0.10
Elx1 -0.09 0.05
Elx2 -0.09 0.07
Ex1 -0.08 -0.02
Ex2 -0.06 -0.04
Ex3 -0.09 -0.09
Ex4 -0.08 -0.06
Exc1 -0.04 0.09
Exc2 -0.04 0.13
Exl1 -0.09 0.07
Exl2 -0.10 0.06
Fs1 -0.09 -0.05
Fs2 -0.09 -0.03
Fsv1 -0.08 0.07
Fsv2 -0.08 0.09
Fv1 -0.09 -0.03
Fv2 -0.08 -0.01
Fvs1 -0.09 0.07
Fvs2 -0.07 0.09
Gt1 -0.07 -0.04
Gt2 -0.08 -0.05
Gtx1 -0.09 0.07
Gtx2 -0.08 0.09
Gx1 -0.06 -0.03
Gx2 -0.08 -0.04
Gxt1 -0.08 0.08
Gxt2 -0.08 0.09
Hs1 -0.07 -0.04
Hs2 -0.07 -0.06
Hst1 0.05 0.06
Hst2 0.05 0.07
Ht1 -0.08 -0.05
Ht2 -0.08 -0.04
Hts1 0.05 0.05
Hts2 0.05 0.06
Ic1 -0.07 -0.05
Ic2 -0.09 -0.05
Icp1 -0.06 0.08
Icp2 -0.07 0.07
Ik1 -0.07 -0.04
Ik2 -0.08 -0.05
Iks1 -0.06 0.09
Iks2 -0.07 0.09
Ip1 -0.05 -0.02
Ip2 -0.08 -0.05
Ip3 -0.05 -0.05
Ip4 -0.08 -0.05
Ip5 -0.08 -0.06
Ip6 -0.08 -0.03
Ipc1 -0.08 0.08
Ipc2 -0.08 0.08
Ipv1 -0.05 0.08
Ipv2 -0.05 0.10
Ipx1 -0.08 0.07
Ipx2 -0.08 0.07
Is1 -0.09 -0.02
Is2 -0.08 -0.01
Is3 -0.09 -0.05
Is4 -0.09 -0.04
Isk1 -0.07 0.05
Isk2 -0.07 0.09
Ist1 0.05 0.05
Ist2 0.05 0.06
It1 -0.08 -0.05
It2 -0.07 -0.03
Its1 0.04 0.05
Its2 0.07 0.07
Iv1 -0.09 -0.05
Iv2 -0.08 -0.06
Ivp1 -0.05 0.10
Ivp2 -0.05 0.09
Ix1 -0.07 -0.06
Ix2 -0.08 -0.02
Ixp1 -0.08 0.07
Ixp2 -0.07 0.09
Jk1 -0.07 -0.02
Jk2 -0.08 -0.04
Jkx1 -0.08 0.08
Jkx2 -0.08 0.08
Jx1 -0.08 -0.05
Jx2 -0.07 -0.02
Jxk1 -0.07 0.07
Jxk2 -0.07 0.09
Kk1 -0.08 -0.04
Kk2 -0.08 -0.08
Kkl1 -0.01 0.08
Kkl2 0.01 0.07
Kl1 -0.08 -0.06
Kl2 -0.09 -0.05
Klk1 0.03 0.09
Klk2 0.03 0.08
Lk1 -0.08 -0.03
Lk2 -0.09 -0.06
Lkl1 -0.04 0.10
Lkl2 -0.03 0.11
Ll1 -0.09 -0.02
Ll2 -0.07 -0.06
Ll3 -0.08 -0.07
Ll4 -0.07 -0.06
Llk1 -0.02 0.09
Llk2 -0.02 0.10
Llp1 0.04 0.08
Llp2 0.02 0.08
Lp1 -0.07 -0.02
Lp2 -0.08 -0.05
Lp3 -0.08 -0.06
Lp4 -0.09 -0.05
Lpl1 0.03 0.06
Lpl2 0.03 0.06
Lpv1 -0.03 0.12
Lpv2 -0.04 0.12
Lv1 -0.07 -0.07
Lv2 -0.08 -0.06
Lvp1 -0.04 0.11
Lvp2 -0.04 0.13
Mc1 -0.03 -0.05
Mc2 -0.03 -0.08
Mc3 -0.04 -0.07
Mc4 -0.01 -0.07
Mch1 -0.02 0.08
Mch2 -0.04 0.08
Mcl1 -0.08 0.07
Mcl2 -0.09 0.08
Mh1 -0.06 -0.06
Mh2 -0.07 -0.05
Mh3 -0.07 -0.05
Mh4 -0.07 -0.05
Mhc1 -0.03 0.07
Mhc2 -0.03 0.09
Mhl1 -0.08 0.08
Mhl2 -0.09 0.10
Ml1 -0.08 -0.06
Ml2 -0.07 -0.07
Ml3 -0.07 -0.06
Ml4 -0.05 -0.05
Mlc1 -0.09 0.04
Mlc2 -0.08 0.10
Mlh1 -0.08 0.06
Mlh2 -0.09 0.07
Nk1 -0.08 -0.04
Nk2 -0.08 -0.03
Nkp1 -0.04 0.09
Nkp2 -0.04 0.10
Np1 -0.09 -0.05
Np2 -0.07 -0.04
Npk1 -0.03 0.10
Npk2 -0.03 0.09

It can be seen that:

  • the repetitions of the same item tend to have very similar loadings,
  • PC1 opposes Its, Bxh, Ist, Hst, Llp and Klk (all presentations and variants, except for Kkl1 and Kkl2, which are close to 0.0; these are all “weird” items) to all the others (excluding Dhc1, Dhc2, Dch1, Kkl1 and Mc4, which are close to 0.0),
  • PC2 opposes the ‘same’ to the ‘different’ stimuli.
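Such a PCA can be sketched with base R's `prcomp()`; `resp` below is a simulated participants × items 0/1 matrix standing in for the actual tone-item responses (names and sizes are illustrative only):

```r
# PCA of a (simulated) participants x items matrix of 0/1 responses.
set.seed(42)
resp <- matrix(rbinom(200 * 10, 1, 0.8), nrow = 200)
pca  <- prcomp(resp, scale. = TRUE)            # PCA on the correlation matrix
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # per-PC proportion of variance
round(pca$rotation[, 1:2], 2)                  # loadings on PC1 and PC2
```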

For EFA, all the preliminary tests suggest that factor analysis is appropriate, with the possible exception of a determinant very close to 0.0 (Kaiser-Meyer-Olkin = 0.89 > 0.60; Bartlett’s test is significant: χ2(21528)=61167.2, p=0; and det(cor(data))=7.5e-64 > 0). However, when it comes to the best number of factors, the various methods diverge, but the overall story seems to be that 1 or 2 factors might be enough (but there seems to be a lot of variation beyond this as well, just as in the case of the PCA):
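These preliminary checks are presumably those implemented in the `psych` package (`KMO()` and `cortest.bartlett()`); as a sanity check, Bartlett's statistic and the determinant can also be computed directly in base R (on simulated stand-in data here, not the actual item responses):

```r
# Bartlett's test of sphericity and the determinant check, from first
# principles; 'd' is simulated stand-in data, not the actual tone items.
set.seed(1)
d <- matrix(rnorm(500 * 8), ncol = 8)
R <- cor(d); n <- nrow(d); p <- ncol(d)
chi2 <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))   # Bartlett's chi-squared
df   <- p * (p - 1) / 2                            # here: 8 * 7 / 2 = 28
pval <- pchisq(chi2, df, lower.tail = FALSE)
c(chi2 = chi2, df = df, p = pval, det = det(R))
```

In the actual analysis, the corresponding quantities are the χ2(21528)=61167.2 and det(cor(data))=7.5e-64 reported above.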

**Figure S28.** Screeplot of the observed, simulated and randomized data with 1 standard deviation error bars (as generated by `fa.parallel()`). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS28'/>


**Figure S29.** Number of factors as suggested by the VSS criterion (top left), the complexity of the solution (top right), BIC (bottom left) and Root Mean Residual (bottom right), as implemented by `nfactors()`. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS29'/>

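Beyond `fa.parallel()` and `nfactors()`, the 1- vs 2-factor comparison can also be probed with the likelihood-ratio fit test reported by base R's `factanal()`; the data below are simulated with two underlying factors, purely for illustration:

```r
# Likelihood-ratio comparison of 1- vs 2-factor ML solutions (factanal);
# toy data generated from two orthogonal factors, four indicators each.
set.seed(7)
f1 <- rnorm(300); f2 <- rnorm(300)
d  <- cbind(sapply(1:4, function(i) f1 + rnorm(300)),
            sapply(1:4, function(i) f2 + rnorm(300)))
fit1 <- factanal(d, factors = 1)
fit2 <- factanal(d, factors = 2)
c(fit1$PVAL, fit2$PVAL)   # p-values of the "k factors are sufficient" test
```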

It seems that a 2-factor model is better than a 1-factor one, while a 3rd factor does not improve the fit much; even so, the fit to the data is far from perfect:
**Figure S30.** Loadings of the variables in the 2-factors model. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS30'/>


and the actual loadings on the two factors are (showing only those ≥ 0.1 in absolute value):


Loadings:
     Factor1 Factor2
Ak1   0.461         
Ak2   0.516         
Akt1          0.499 
Akt2          0.522 
At1   0.482   0.126 
At2   0.472         
Atk1          0.512 
Atk2          0.477 
Bh1   0.450   0.139 
Bh2   0.447         
Bhx1 -0.376         
Bhx2 -0.373         
Bs1   0.535   0.128 
Bs2   0.570         
Bsv1          0.495 
Bsv2          0.533 
Bv1   0.493         
Bv2   0.482         
Bvs1  0.118   0.483 
Bvs2  0.116   0.513 
Bx1   0.416         
Bx2   0.569   0.130 
Bxh1 -0.309  -0.129 
Bxh2 -0.291         
Ch1   0.557         
Ch2   0.585   0.109 
Chv1          0.406 
Chv2          0.426 
Cv1   0.489         
Cv2   0.459         
Cvh1          0.479 
Cvh2  0.127   0.420 
Dc1   0.392   0.128 
Dc2   0.530         
Dc3   0.638         
Dc4   0.523         
Dch1 -0.216   0.338 
Dch2          0.362 
Dcx1          0.479 
Dcx2          0.467 
Dh1   0.442         
Dh2   0.469         
Dhc1 -0.246   0.230 
Dhc2 -0.252   0.327 
Dx1   0.531         
Dx2   0.536         
Dxc1          0.476 
Dxc2          0.475 
Ec1   0.507         
Ec2   0.506         
Ec3   0.497         
Ec4   0.628         
Ecl1  0.144   0.493 
Ecl2  0.108   0.514 
Ecx1          0.399 
Ecx2          0.464 
El1   0.469         
El2   0.565         
El3   0.506         
El4   0.578         
Elc1  0.173   0.528 
Elc2          0.564 
Elx1  0.275   0.437 
Elx2  0.206   0.511 
Ex1   0.406   0.139 
Ex2   0.390         
Ex3   0.647         
Ex4   0.534         
Exc1          0.420 
Exc2 -0.167   0.513 
Exl1  0.205   0.490 
Exl2  0.264   0.495 
Fs1   0.557         
Fs2   0.516   0.153 
Fsv1  0.164   0.461 
Fsv2  0.114   0.543 
Fv1   0.504   0.144 
Fv2   0.369   0.206 
Fvs1  0.196   0.501 
Fvs2          0.510 
Gt1   0.452         
Gt2   0.543         
Gtx1  0.180   0.509 
Gtx2  0.114   0.566 
Gx1   0.357         
Gx2   0.478         
Gxt1  0.141   0.491 
Gxt2  0.119   0.538 
Hs1   0.447         
Hs2   0.520         
Hst1 -0.414         
Hst2 -0.388         
Ht1   0.507         
Ht2   0.494   0.113 
Hts1 -0.361         
Hts2 -0.386         
Ic1   0.492         
Ic2   0.547         
Icp1          0.437 
Icp2  0.145   0.431 
Ik1   0.429         
Ik2   0.491         
Iks1          0.469 
Iks2          0.536 
Ip1   0.298         
Ip2   0.492         
Ip3   0.385         
Ip4   0.507         
Ip5   0.506         
Ip6   0.435   0.137 
Ipc1  0.117   0.506 
Ipc2  0.155   0.498 
Ipv1          0.382 
Ipv2          0.451 
Ipx1  0.156   0.464 
Ipx2  0.166   0.481 
Is1   0.472   0.186 
Is2   0.415   0.189 
Is3   0.556         
Is4   0.513   0.116 
Isk1  0.155   0.371 
Isk2          0.512 
Ist1 -0.389         
Ist2 -0.397         
It1   0.501         
It2   0.404         
Its1 -0.349         
Its2 -0.490         
Iv1   0.575         
Iv2   0.538         
Ivp1          0.475 
Ivp2          0.435 
Ix1   0.482         
Ix2   0.394   0.175 
Ixp1  0.162   0.496 
Ixp2          0.528 
Jk1   0.374   0.116 
Jk2   0.469   0.107 
Jkx1  0.156   0.520 
Jkx2  0.104   0.518 
Jx1   0.512         
Jx2   0.383   0.144 
Jxk1  0.128   0.452 
Jxk2          0.518 
Kk1   0.493   0.105 
Kk2   0.610         
Kkl1 -0.153   0.264 
Kkl2 -0.250   0.178 
Kl1   0.535         
Kl2   0.557         
Klk1 -0.349   0.197 
Klk2 -0.321   0.159 
Lk1   0.480   0.127 
Lk2   0.570         
Lkl1          0.437 
Lkl2 -0.144   0.429 
Ll1   0.451   0.173 
Ll2   0.521         
Ll3   0.571         
Ll4   0.489         
Llk1 -0.133   0.345 
Llk2 -0.141   0.371 
Llp1 -0.386   0.145 
Llp2 -0.301   0.228 
Lp1   0.387   0.127 
Lp2   0.510         
Lp3   0.544         
Lp4   0.528         
Lpl1 -0.285   0.105 
Lpl2 -0.311   0.101 
Lpv1 -0.169   0.467 
Lpv2 -0.154   0.489 
Lv1   0.537         
Lv2   0.529         
Lvp1 -0.114   0.458 
Lvp2 -0.180   0.515 
Mc1   0.276         
Mc2   0.341  -0.162 
Mc3   0.361  -0.128 
Mc4   0.218  -0.173 
Mch1 -0.141   0.303 
Mch2          0.342 
Mcl1  0.183   0.489 
Mcl2  0.151   0.536 
Mh1   0.459         
Mh2   0.483         
Mh3   0.483         
Mh4   0.463         
Mhc1          0.290 
Mhc2 -0.101   0.366 
Mhl1  0.122   0.495 
Mhl2  0.131   0.595 
Ml1   0.549         
Ml2   0.518         
Ml3   0.501         
Ml4   0.396         
Mlc1  0.313   0.422 
Mlc2          0.576 
Mlh1  0.184   0.424 
Mlh2  0.230   0.504 
Nk1   0.471   0.103 
Nk2   0.425   0.128 
Nkp1          0.408 
Nkp2          0.452 
Np1   0.543         
Np2   0.407         
Npk1 -0.118   0.411 
Npk2          0.364 

               Factor1 Factor2
SS loadings     28.929  19.434
Proportion Var   0.139   0.093
Cumulative Var   0.139   0.233

It can be seen that:

  • the repetitions of the same item tend to have very similar loadings,
  • FA1 is composed of all the ‘same’ items (all presentations, and all with + loadings; please note that the actual sign is arbitrary but the differences in signs matter) and a few ‘different’ items: Bhx, Hst, Ist, Kkl and Llp (all presentations and variants, all with - loadings) – it is interesting to note that all these items are “weird”,
  • FA2 is composed of most of the ‘different’ items: Akt, Bsv, Chv, Dch, Dcx, Ecl, Ecx, Elx, Fsv, Gtx, Icp, Iks, Ipv, Ipx, Isk, Jkx, Lkl, Lpv, Mch, Mcl, Mhc, Mhl, and Nkp (all presentations and variants, all with + loadings).

Thus, PC1 basically opposes the “weird” items to all the other items, while PC2 opposes the ‘same’ to the ‘different’ items. Likewise, it seems that FA1 really captures the ‘same’ items and the “weird” ‘different’ items (but with opposite signs), while FA2 captures the “normal” ‘different’ items.

These observations prompt the question: are the “weird” items really of the ‘same’ type?

2.4.4 Are the “weird” items really of the ‘same’ type?

Taken together, these observations suggest the hypothesis that the “weird” stimuli, while designed as ‘different’, are in fact perceived by the participants as (albeit rather difficult) ‘same’ items. If true, this hypothesis implies that recoding them as such (i.e., flipping their “correct” and “incorrect” responses) should align them with the other ‘same’ items.
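A minimal sketch of this recoding, assuming a long-format data frame with an `item` column and a 0/1 `correct` column (hypothetical names; the actual script may store the responses differently), and taking the “weird” families to be those named in the PCA/EFA discussion together with their reversed-order counterparts:

```r
# Flip "correct" and "incorrect" for the "weird" item families
# (hypothetical data layout; 'weird' lists the families flagged above).
weird <- c("Bhx", "Bxh", "Hst", "Hts", "Ist", "Its",
           "Kkl", "Klk", "Llp", "Lpl")
resp <- data.frame(item    = c("Bhx1", "Ak1", "Ist2"),  # toy responses
                   correct = c(0, 1, 0))
fam  <- sub("[0-9]+$", "", resp$item)         # drop the repetition index
flip <- fam %in% weird
resp$correct[flip] <- 1 - resp$correct[flip]  # a 'same' answer now counts as correct
resp$correct
```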

2.4.4.1 Percent correct responses

Table S12. ‘Flipped’ weird items: Frequencies of ‘yes’ responses (the items are ordered by % correct responses).
Item (short name) # correct responses % correct responses
Dhc1 233 47.4%
Dhc2 239 48.6%
Dch2 266 54.1%
Mhc2 281 57.1%
Dch1 285 57.9%
Mhc1 288 58.5%
Llk2 292 59.3%
Kkl1 293 59.6%
Llk1 301 61.2%
Mch2 303 61.6%
Npk2 303 61.6%
Mch1 306 62.2%
Kkl2 311 63.2%
Lkl2 313 63.6%
Ipv1 326 66.3%
Ipv2 333 67.7%
Npk1 336 68.3%
Lpv1 337 68.5%
Nkp2 341 69.3%
Lpv2 343 69.7%
Ivp2 352 71.5%
Lvp2 353 71.7%
Klk2 357 72.6%
Lkl1 357 72.6%
Mc1 357 72.6%
Nkp1 357 72.6%
Lpl1 358 72.8%
Ivp1 360 73.2%
Mc2 360 73.2%
Lvp1 362 73.6%
Exc2 363 73.8%
Klk1 364 74.0%
Dxc2 366 74.4%
Mc3 366 74.4%
Lpl2 369 75.0%
Llp2 370 75.2%
Ecx2 371 75.4%
Dxc1 372 75.6%
Ecx1 373 75.8%
Bhx1 376 76.4%
Exc1 379 77.0%
Llp1 379 77.0%
Bhx2 382 77.6%
Chv1 384 78.0%
Atk1 386 78.5%
Mc4 386 78.5%
Bxh1 388 78.9%
Atk2 390 79.3%
Dcx2 390 79.3%
Bxh2 393 79.9%
Hts1 395 80.3%
Chv2 400 81.3%
Ist1 402 81.7%
Ist2 403 81.9%
Bsv2 404 82.1%
Its1 404 82.1%
Akt1 405 82.3%
Dcx1 406 82.5%
Bx1 407 82.7%
Isk2 410 83.3%
Hts2 412 83.7%
Akt2 413 83.9%
Ex1 413 83.9%
Iks2 413 83.9%
Jx1 414 84.1%
Elc2 415 84.3%
Hst1 415 84.3%
Icp2 415 84.3%
Iks1 415 84.3%
Ixp2 416 84.6%
Cvh1 417 84.8%
Gx2 417 84.8%
Ipc2 417 84.8%
Hst2 418 85.0%
Isk1 418 85.0%
Gx1 419 85.2%
Bv1 421 85.6%
Fsv2 422 85.8%
Bsv1 423 86.0%
Elx1 423 86.0%
Gxt2 423 86.0%
Jxk2 423 86.0%
Cvh2 424 86.2%
Ecl2 424 86.2%
Elc1 424 86.2%
Exl2 424 86.2%
Its2 424 86.2%
Ml1 425 86.4%
Elx2 426 86.6%
Ex3 426 86.6%
Ipx2 426 86.6%
Ll1 426 86.6%
Bx2 427 86.8%
Ecl1 427 86.8%
Fsv1 427 86.8%
Bvs2 428 87.0%
Lk2 428 87.0%
Fvs2 429 87.2%
Ll2 429 87.2%
Dx1 430 87.4%
Fvs1 430 87.4%
Gtx2 430 87.4%
Jk1 430 87.4%
Jxk1 430 87.4%
Mh2 430 87.4%
Mhl2 430 87.4%
Mlh2 430 87.4%
Icp1 431 87.6%
Mh1 431 87.6%
Bh1 432 87.8%
Dh1 432 87.8%
Ip1 432 87.8%
Ip2 432 87.8%
Ixp1 432 87.8%
Jkx1 432 87.8%
Jkx2 432 87.8%
Ll3 432 87.8%
Mlc1 432 87.8%
Bv2 433 88.0%
Ex4 433 88.0%
Fv1 433 88.0%
Jx2 433 88.0%
Mcl2 433 88.0%
Mhl1 433 88.0%
Ch1 434 88.2%
Ipc1 434 88.2%
Lk1 434 88.2%
Np1 434 88.2%
Bvs1 435 88.4%
Exl1 435 88.4%
Ic1 435 88.4%
Is1 435 88.4%
Ix1 435 88.4%
Ll4 435 88.4%
Mlc2 435 88.4%
Ak2 436 88.6%
Lv2 436 88.6%
El3 438 89.0%
Ip6 438 89.0%
Ipx1 438 89.0%
Bs1 439 89.2%
Lp1 439 89.2%
Dx2 440 89.4%
El2 440 89.4%
Ip4 440 89.4%
Lv1 440 89.4%
Mcl1 440 89.4%
Mh3 440 89.4%
Mlh1 440 89.4%
Ch2 441 89.6%
Cv1 441 89.6%
El4 441 89.6%
Ak1 442 89.8%
Gxt1 442 89.8%
Ht2 442 89.8%
Ip5 442 89.8%
It2 442 89.8%
Lp2 442 89.8%
Ml3 442 89.8%
Ex2 443 90.0%
Ic2 443 90.0%
Ip3 443 90.0%
Iv2 443 90.0%
Kk1 443 90.0%
Nk1 443 90.0%
At1 444 90.2%
Gtx1 444 90.2%
Is2 444 90.2%
Ix2 444 90.2%
Bs2 445 90.4%
Dc3 445 90.4%
El1 445 90.4%
Cv2 446 90.7%
Fs1 446 90.7%
Ht1 446 90.7%
Ik2 446 90.7%
Dc4 447 90.9%
It1 447 90.9%
Iv1 447 90.9%
Jk2 447 90.9%
Lp3 448 91.1%
Mh4 448 91.1%
Ml2 448 91.1%
Dc1 449 91.3%
Dh2 449 91.3%
Ik1 449 91.3%
Is4 449 91.3%
Bh2 450 91.5%
Ec1 450 91.5%
Ec2 450 91.5%
Gt1 450 91.5%
Lp4 450 91.5%
Fs2 451 91.7%
Nk2 451 91.7%
Ec4 452 91.9%
Is3 452 91.9%
Ml4 453 92.1%
Gt2 454 92.3%
Hs1 454 92.3%
Ec3 455 92.5%
Fv2 455 92.5%
Dc2 456 92.7%
At2 457 92.9%
Kl2 457 92.9%
Np2 459 93.3%
Kk2 461 93.7%
Kl1 461 93.7%
Hs2 463 94.1%
**Figure S31.** 'Flipped' weird items: Endorsement frequencies by item (items ordered by % correct responses). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS31'/>


2.4.4.2 Correlations between items

**Figure S32.** 'Flipped' weird items: Correlation matrix between items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS32'/>


It can be seen that the tetrachoric correlations between the items are rather low, varying between -0.34 and 0.69, with a mean of 0.3, a median of 0.32, sd of 0.16 and IQR of 0.23:
**Figure S33.** Histogram of the tetrachoric correlations between different items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS33'/>

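The summary statistics above can be obtained from the pairwise correlation matrix; in this sketch a plain Pearson matrix on simulated data stands in for the actual tetrachoric one (which would come from, e.g., `psych::tetrachoric(resp)$rho`):

```r
# Summarise the pairwise inter-item correlations (off-diagonal entries);
# 'tc' is a stand-in Pearson matrix, not the actual tetrachoric one.
set.seed(3)
resp <- matrix(rbinom(500 * 6, 1, 0.7), ncol = 6)
tc  <- cor(resp)
off <- tc[lower.tri(tc)]   # each item pair counted once
round(c(min = min(off), max = max(off), mean = mean(off),
        median = median(off), sd = sd(off), IQR = IQR(off)), 2)
```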

**Figure S34.** 'Flipped' weird items: Hierarchical clustering of the items using 1 - tetrachoric correlations. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS34'/>

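The clustering itself can be reproduced in base R by turning correlations into distances (1 - r) and calling `hclust()`; again a Pearson matrix on toy data stands in for the tetrachoric one, and the linkage method is an assumption:

```r
# Hierarchical clustering of items with 1 - correlation as the distance.
set.seed(3)
resp <- matrix(rbinom(500 * 6, 1, 0.7), ncol = 6,
               dimnames = list(NULL, paste0("item", 1:6)))
tc <- cor(resp)                                 # stand-in correlation matrix
hc <- hclust(as.dist(1 - tc), method = "average")
hc$labels                                       # the items, ready for plot(hc)
```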

**Figure S35.** 'Flipped' weird items: Mean correlation with the other items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS35'/>


2.4.4.3 PCA and FA

For PCA, PC1 explains 20.1% of the variance, followed by PC2 which explains 5.5%, suggesting there is a main factor on which most items load, but the story is a bit more complex, with at least a 2nd factor needed (and a lot of variation remaining unexplained by these 2 components):

**Figure S36.** 'Flipped' weird items: Screeplot of the PCA of all the tone items together. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS36'/>


**Figure S37.** 'Flipped' weird items: Loading of the tone items on the first 2 PCs. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS37'/>


**Figure S38.** 'Flipped' weird items: The participants plotted on the first 2 PCs, colored by their qualities of representation (cos2). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS38'/>


The actual loadings on the first two PCs are:

PC1 PC2
Ak1 -0.07 -0.05
Ak2 -0.08 -0.06
Akt1 -0.07 0.09
Akt2 -0.07 0.09
At1 -0.08 -0.03
At2 -0.08 -0.04
Atk1 -0.06 0.10
Atk2 -0.06 0.09
Bh1 -0.08 -0.03
Bh2 -0.07 -0.04
Bhx1 -0.05 -0.06
Bhx2 -0.04 -0.06
Bs1 -0.09 -0.04
Bs2 -0.09 -0.05
Bsv1 -0.07 0.08
Bsv2 -0.07 0.10
Bv1 -0.08 -0.05
Bv2 -0.08 -0.04
Bvs1 -0.07 0.08
Bvs2 -0.08 0.08
Bx1 -0.07 -0.03
Bx2 -0.10 -0.04
Bxh1 -0.06 -0.01
Bxh2 -0.04 -0.04
Ch1 -0.08 -0.07
Ch2 -0.10 -0.05
Chv1 -0.04 0.09
Chv2 -0.05 0.09
Cv1 -0.07 -0.06
Cv2 -0.07 -0.04
Cvh1 -0.07 0.09
Cvh2 -0.07 0.07
Dc1 -0.07 -0.02
Dc2 -0.07 -0.07
Dc3 -0.09 -0.07
Dc4 -0.08 -0.05
Dch1 -0.01 0.10
Dch2 -0.03 0.09
Dcx1 -0.05 0.10
Dcx2 -0.05 0.10
Dh1 -0.07 -0.04
Dh2 -0.06 -0.07
Dhc1 0.01 0.08
Dhc2 0.00 0.10
Dx1 -0.07 -0.07
Dx2 -0.08 -0.06
Dxc1 -0.06 0.09
Dxc2 -0.05 0.10
Ec1 -0.08 -0.06
Ec2 -0.08 -0.05
Ec3 -0.07 -0.06
Ec4 -0.08 -0.08
Ecl1 -0.08 0.08
Ecl2 -0.08 0.09
Ecx1 -0.05 0.08
Ecx2 -0.05 0.10
El1 -0.07 -0.06
El2 -0.08 -0.07
El3 -0.07 -0.06
El4 -0.08 -0.08
Elc1 -0.09 0.08
Elc2 -0.07 0.10
Elx1 -0.09 0.05
Elx2 -0.09 0.07
Ex1 -0.08 -0.02
Ex2 -0.06 -0.04
Ex3 -0.09 -0.09
Ex4 -0.08 -0.06
Exc1 -0.04 0.09
Exc2 -0.04 0.13
Exl1 -0.09 0.07
Exl2 -0.10 0.06
Fs1 -0.09 -0.05
Fs2 -0.09 -0.03
Fsv1 -0.08 0.07
Fsv2 -0.08 0.09
Fv1 -0.09 -0.03
Fv2 -0.08 -0.01
Fvs1 -0.09 0.07
Fvs2 -0.07 0.09
Gt1 -0.07 -0.04
Gt2 -0.08 -0.05
Gtx1 -0.09 0.07
Gtx2 -0.08 0.09
Gx1 -0.06 -0.03
Gx2 -0.08 -0.04
Gxt1 -0.08 0.08
Gxt2 -0.08 0.09
Hs1 -0.07 -0.04
Hs2 -0.07 -0.06
Hst1 -0.05 -0.06
Hst2 -0.05 -0.07
Ht1 -0.08 -0.05
Ht2 -0.08 -0.04
Hts1 -0.05 -0.05
Hts2 -0.05 -0.06
Ic1 -0.07 -0.05
Ic2 -0.09 -0.05
Icp1 -0.06 0.08
Icp2 -0.07 0.07
Ik1 -0.07 -0.04
Ik2 -0.08 -0.05
Iks1 -0.06 0.09
Iks2 -0.07 0.09
Ip1 -0.05 -0.02
Ip2 -0.08 -0.05
Ip3 -0.05 -0.05
Ip4 -0.08 -0.05
Ip5 -0.08 -0.06
Ip6 -0.08 -0.03
Ipc1 -0.08 0.08
Ipc2 -0.08 0.08
Ipv1 -0.05 0.08
Ipv2 -0.05 0.10
Ipx1 -0.08 0.07
Ipx2 -0.08 0.07
Is1 -0.09 -0.02
Is2 -0.08 -0.01
Is3 -0.09 -0.05
Is4 -0.09 -0.04
Isk1 -0.07 0.05
Isk2 -0.07 0.09
Ist1 -0.05 -0.05
Ist2 -0.05 -0.06
It1 -0.08 -0.05
It2 -0.07 -0.03
Its1 -0.04 -0.05
Its2 -0.07 -0.07
Iv1 -0.09 -0.05
Iv2 -0.08 -0.06
Ivp1 -0.05 0.10
Ivp2 -0.05 0.09
Ix1 -0.07 -0.06
Ix2 -0.08 -0.02
Ixp1 -0.08 0.07
Ixp2 -0.07 0.09
Jk1 -0.07 -0.02
Jk2 -0.08 -0.04
Jkx1 -0.08 0.08
Jkx2 -0.08 0.08
Jx1 -0.08 -0.05
Jx2 -0.07 -0.02
Jxk1 -0.07 0.07
Jxk2 -0.07 0.09
Kk1 -0.08 -0.04
Kk2 -0.08 -0.08
Kkl1 0.01 -0.08
Kkl2 -0.01 -0.07
Kl1 -0.08 -0.06
Kl2 -0.09 -0.05
Klk1 -0.03 -0.09
Klk2 -0.03 -0.08
Lk1 -0.08 -0.03
Lk2 -0.09 -0.06
Lkl1 -0.04 0.10
Lkl2 -0.03 0.11
Ll1 -0.09 -0.02
Ll2 -0.07 -0.06
Ll3 -0.08 -0.07
Ll4 -0.07 -0.06
Llk1 -0.02 0.09
Llk2 -0.02 0.10
Llp1 -0.04 -0.08
Llp2 -0.02 -0.08
Lp1 -0.07 -0.02
Lp2 -0.08 -0.05
Lp3 -0.08 -0.06
Lp4 -0.09 -0.05
Lpl1 -0.03 -0.06
Lpl2 -0.03 -0.06
Lpv1 -0.03 0.12
Lpv2 -0.04 0.12
Lv1 -0.07 -0.07
Lv2 -0.08 -0.06
Lvp1 -0.04 0.11
Lvp2 -0.04 0.13
Mc1 -0.03 -0.05
Mc2 -0.03 -0.08
Mc3 -0.04 -0.07
Mc4 -0.01 -0.07
Mch1 -0.02 0.08
Mch2 -0.04 0.08
Mcl1 -0.08 0.07
Mcl2 -0.09 0.08
Mh1 -0.06 -0.06
Mh2 -0.07 -0.05
Mh3 -0.07 -0.05
Mh4 -0.07 -0.05
Mhc1 -0.03 0.07
Mhc2 -0.03 0.09
Mhl1 -0.08 0.08
Mhl2 -0.09 0.10
Ml1 -0.08 -0.06
Ml2 -0.07 -0.07
Ml3 -0.07 -0.06
Ml4 -0.05 -0.05
Mlc1 -0.09 0.04
Mlc2 -0.08 0.10
Mlh1 -0.08 0.06
Mlh2 -0.09 0.07
Nk1 -0.08 -0.04
Nk2 -0.08 -0.03
Nkp1 -0.04 0.09
Nkp2 -0.04 0.10
Np1 -0.09 -0.05
Np2 -0.07 -0.04
Npk1 -0.03 0.10
Npk2 -0.03 0.09

It can be seen that:

  • the repetitions of the same item tend to have very similar loadings,
  • PC1 is harder to interpret but seems to oppose the vast majority of the items to basically Kkl and Dhc (which have loadings close to 0), while
  • PC2 opposes the ‘same’ (including the ‘flipped’ items) to the ‘different’ items.

For EFA, all the preliminary tests suggest that factor analysis is appropriate, with the possible exception of a determinant very close to 0.0 (Kaiser-Meyer-Olkin = 0.89 > 0.60; Bartlett’s test is significant: χ2(21528)=61167.2, p=0; and det(cor(data))=7.5e-64 > 0). However, when it comes to the best number of factors, the various methods diverge, but the overall story seems to be that 1 or 2 factors might be enough (but there seems to be a lot of variation beyond this as well, just as in the case of the PCA):

**Figure S39.** 'Flipped' weird items: Screeplot of the observed, simulated and randomized data with 1 standard deviation error bars (as generated by `fa.parallel()`). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS39'/>


**Figure S40.** 'Flipped' weird items: Number of factors as suggested by the VSS criterion (top left), the complexity of the solution (top right), BIC (bottom left) and Root Mean Residual (bottom right), as implemented by `nfactors()`. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS40'/>


It seems that a 2-factor model is better than a 1-factor one, while a 3rd factor does not improve the fit much; even so, the fit to the data is far from perfect:
**Figure S41.** 'Flipped' weird items: Loadings of the variables in the 2-factors model. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS41'/>


and the actual loadings on the two factors are (showing only those ≥ 0.1 in absolute value):


Loadings:
     Factor1 Factor2
Ak1   0.461         
Ak2   0.516         
Akt1          0.499 
Akt2          0.522 
At1   0.482   0.126 
At2   0.472         
Atk1          0.512 
Atk2          0.477 
Bh1   0.450   0.139 
Bh2   0.447         
Bhx1  0.376         
Bhx2  0.373         
Bs1   0.535   0.128 
Bs2   0.570         
Bsv1          0.495 
Bsv2          0.533 
Bv1   0.493         
Bv2   0.482         
Bvs1  0.118   0.483 
Bvs2  0.116   0.513 
Bx1   0.416         
Bx2   0.569   0.130 
Bxh1  0.309   0.129 
Bxh2  0.291         
Ch1   0.557         
Ch2   0.585   0.109 
Chv1          0.406 
Chv2          0.426 
Cv1   0.489         
Cv2   0.459         
Cvh1          0.479 
Cvh2  0.127   0.420 
Dc1   0.392   0.128 
Dc2   0.530         
Dc3   0.638         
Dc4   0.523         
Dch1 -0.216   0.338 
Dch2          0.362 
Dcx1          0.479 
Dcx2          0.467 
Dh1   0.442         
Dh2   0.469         
Dhc1 -0.246   0.230 
Dhc2 -0.252   0.327 
Dx1   0.531         
Dx2   0.536         
Dxc1          0.476 
Dxc2          0.475 
Ec1   0.507         
Ec2   0.506         
Ec3   0.497         
Ec4   0.628         
Ecl1  0.144   0.493 
Ecl2  0.108   0.514 
Ecx1          0.399 
Ecx2          0.464 
El1   0.469         
El2   0.565         
El3   0.506         
El4   0.578         
Elc1  0.173   0.528 
Elc2          0.564 
Elx1  0.275   0.437 
Elx2  0.206   0.511 
Ex1   0.406   0.139 
Ex2   0.390         
Ex3   0.647         
Ex4   0.534         
Exc1          0.420 
Exc2 -0.167   0.513 
Exl1  0.205   0.490 
Exl2  0.264   0.495 
Fs1   0.557         
Fs2   0.516   0.153 
Fsv1  0.164   0.461 
Fsv2  0.114   0.543 
Fv1   0.504   0.144 
Fv2   0.369   0.206 
Fvs1  0.196   0.501 
Fvs2          0.510 
Gt1   0.452         
Gt2   0.543         
Gtx1  0.180   0.509 
Gtx2  0.114   0.566 
Gx1   0.357         
Gx2   0.478         
Gxt1  0.141   0.491 
Gxt2  0.119   0.538 
Hs1   0.447         
Hs2   0.520         
Hst1  0.414         
Hst2  0.388         
Ht1   0.507         
Ht2   0.494   0.113 
Hts1  0.361         
Hts2  0.386         
Ic1   0.492         
Ic2   0.547         
Icp1          0.437 
Icp2  0.145   0.431 
Ik1   0.429         
Ik2   0.491         
Iks1          0.469 
Iks2          0.536 
Ip1   0.298         
Ip2   0.492         
Ip3   0.385         
Ip4   0.507         
Ip5   0.506         
Ip6   0.435   0.137 
Ipc1  0.117   0.506 
Ipc2  0.155   0.498 
Ipv1          0.382 
Ipv2          0.451 
Ipx1  0.156   0.464 
Ipx2  0.166   0.481 
Is1   0.472   0.186 
Is2   0.415   0.189 
Is3   0.556         
Is4   0.513   0.116 
Isk1  0.155   0.371 
Isk2          0.512 
Ist1  0.389         
Ist2  0.397         
It1   0.501         
It2   0.404         
Its1  0.349         
Its2  0.490         
Iv1   0.575         
Iv2   0.538         
Ivp1          0.475 
Ivp2          0.435 
Ix1   0.482         
Ix2   0.394   0.175 
Ixp1  0.162   0.496 
Ixp2          0.528 
Jk1   0.374   0.116 
Jk2   0.469   0.107 
Jkx1  0.156   0.520 
Jkx2  0.104   0.518 
Jx1   0.512         
Jx2   0.383   0.144 
Jxk1  0.128   0.452 
Jxk2          0.518 
Kk1   0.493   0.105 
Kk2   0.610         
Kkl1  0.153  -0.264 
Kkl2  0.250  -0.178 
Kl1   0.535         
Kl2   0.557         
Klk1  0.349  -0.197 
Klk2  0.321  -0.159 
Lk1   0.480   0.127 
Lk2   0.570         
Lkl1          0.437 
Lkl2 -0.144   0.429 
Ll1   0.451   0.173 
Ll2   0.521         
Ll3   0.571         
Ll4   0.489         
Llk1 -0.133   0.345 
Llk2 -0.141   0.371 
Llp1  0.386  -0.145 
Llp2  0.301  -0.228 
Lp1   0.387   0.127 
Lp2   0.510         
Lp3   0.544         
Lp4   0.528         
Lpl1  0.285  -0.105 
Lpl2  0.311  -0.101 
Lpv1 -0.169   0.467 
Lpv2 -0.154   0.489 
Lv1   0.537         
Lv2   0.529         
Lvp1 -0.114   0.458 
Lvp2 -0.180   0.515 
Mc1   0.276         
Mc2   0.341  -0.162 
Mc3   0.361  -0.128 
Mc4   0.218  -0.173 
Mch1 -0.141   0.303 
Mch2          0.342 
Mcl1  0.183   0.489 
Mcl2  0.151   0.536 
Mh1   0.459         
Mh2   0.483         
Mh3   0.483         
Mh4   0.463         
Mhc1          0.290 
Mhc2 -0.101   0.366 
Mhl1  0.122   0.495 
Mhl2  0.131   0.595 
Ml1   0.549         
Ml2   0.518         
Ml3   0.501         
Ml4   0.396         
Mlc1  0.313   0.422 
Mlc2          0.576 
Mlh1  0.184   0.424 
Mlh2  0.230   0.504 
Nk1   0.471   0.103 
Nk2   0.425   0.128 
Nkp1          0.408 
Nkp2          0.452 
Np1   0.543         
Np2   0.407         
Npk1 -0.118   0.411 
Npk2          0.364 

               Factor1 Factor2
SS loadings     28.929  19.434
Proportion Var   0.139   0.093
Cumulative Var   0.139   0.233

It can be seen that:

  • the repetitions of the same item tend to have very similar loadings,
  • FA1 is composed of all the ‘same’ items, including the “flipped” items Bhx, Hst, Ist, Kkl and Llp (all presentations, and all with loadings of the same sign),
  • FA2 is composed of all the ‘different’ items (all presentations and variants, all with the same sign).

Thus, it seems that the main difference now is between the (“extended”) ‘same’ and the ‘different’ items, which seems to make much more sense.

2.4.4.4 MSA

Returning to the Mokken analysis:
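The item homogeneity (H) values in Table S13 below are presumably those returned by the `mokken` package (e.g., `coefH()`, which also provides the standard errors and confidence intervals); Loevinger's item coefficient itself can be sketched from first principles in base R:

```r
# Loevinger's item scalability H_i from first principles (base R sketch);
# in practice mokken::coefH also reports standard errors and CIs.
item_H <- function(X) {                 # X: persons x items matrix of 0/1
  p <- colMeans(X); n <- nrow(X); k <- ncol(X)
  Fm <- Em <- matrix(0, k, k)           # observed / expected Guttman errors
  for (i in 1:(k - 1)) for (j in (i + 1):k) {
    easy <- if (p[i] >= p[j]) i else j  # the more often endorsed item
    hard <- if (p[i] >= p[j]) j else i
    Fm[i, j] <- Fm[j, i] <- sum(X[, easy] == 0 & X[, hard] == 1)
    Em[i, j] <- Em[j, i] <- n * (1 - p[easy]) * p[hard]
  }
  1 - rowSums(Fm) / rowSums(Em)         # H_i = 1 when there are no Guttman errors
}
# A perfect Guttman scale has H_i = 1 for every item:
X <- rbind(c(0, 0, 0), c(1, 0, 0), c(1, 1, 0), c(1, 1, 1))
item_H(X)
```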

Table S13. ‘Flipped’ weird items: MSA: Homogeneity values (with standard errors and 95%CIs) for all items, sorted by item name.
Item   Item H   (se)   95% CI
Ak1 0.230 (0.030) [0.172, 0.288]
Ak2 0.244 (0.027) [0.191, 0.298]
Akt1 0.252 (0.027) [0.199, 0.305]
Akt2 0.252 (0.027) [0.199, 0.305]
At1 0.281 (0.025) [0.233, 0.330]
At2 0.298 (0.027) [0.246, 0.350]
Atk1 0.212 (0.028) [0.158, 0.266]
Atk2 0.233 (0.027) [0.180, 0.286]
Bh1 0.260 (0.024) [0.213, 0.307]
Bh2 0.263 (0.027) [0.209, 0.317]
Bhx1 0.154 (0.029) [0.098, 0.209]
Bhx2 0.147 (0.028) [0.092, 0.203]
Bs1 0.297 (0.022) [0.254, 0.340]
Bs2 0.303 (0.023) [0.258, 0.349]
Bsv1 0.247 (0.026) [0.196, 0.299]
Bsv2 0.237 (0.028) [0.182, 0.292]
Bv1 0.241 (0.027) [0.188, 0.294]
Bv2 0.249 (0.026) [0.197, 0.300]
Bvs1 0.253 (0.025) [0.203, 0.302]
Bvs2 0.262 (0.025) [0.212, 0.312]
Bx1 0.231 (0.028) [0.176, 0.286]
Bx2 0.307 (0.024) [0.261, 0.354]
Bxh1 0.211 (0.028) [0.155, 0.267]
Bxh2 0.133 (0.028) [0.077, 0.189]
Ch1 0.251 (0.025) [0.203, 0.299]
Ch2 0.314 (0.022) [0.272, 0.357]
Chv1 0.167 (0.028) [0.112, 0.222]
Chv2 0.170 (0.028) [0.116, 0.224]
Cv1 0.229 (0.027) [0.176, 0.281]
Cv2 0.246 (0.028) [0.192, 0.301]
Cvh1 0.230 (0.027) [0.177, 0.283]
Cvh2 0.231 (0.027) [0.178, 0.283]
Dc1 0.249 (0.025) [0.200, 0.299]
Dc2 0.274 (0.027) [0.221, 0.326]
Dc3 0.307 (0.024) [0.259, 0.355]
Dc4 0.267 (0.025) [0.217, 0.316]
Dch1 0.075 (0.035) [0.006, 0.143]
Dch2 0.200 (0.037) [0.129, 0.272]
Dcx1 0.193 (0.028) [0.139, 0.247]
Dcx2 0.182 (0.028) [0.127, 0.237]
Dh1 0.221 (0.027) [0.168, 0.273]
Dh2 0.192 (0.031) [0.131, 0.253]
Dhc1 -0.020 (0.043) [-0.104, 0.064]
Dhc2 0.056 (0.042) [-0.026, 0.138]
Dx1 0.233 (0.026) [0.183, 0.284]
Dx2 0.256 (0.026) [0.204, 0.308]
Dxc1 0.225 (0.029) [0.169, 0.281]
Dxc2 0.198 (0.029) [0.141, 0.254]
Ec1 0.263 (0.025) [0.213, 0.313]
Ec2 0.266 (0.027) [0.213, 0.319]
Ec3 0.261 (0.029) [0.204, 0.319]
Ec4 0.297 (0.024) [0.249, 0.345]
Ecl1 0.267 (0.026) [0.217, 0.317]
Ecl2 0.261 (0.025) [0.211, 0.311]
Ecx1 0.192 (0.029) [0.136, 0.248]
Ecx2 0.211 (0.029) [0.155, 0.267]
El1 0.219 (0.030) [0.160, 0.277]
El2 0.250 (0.026) [0.200, 0.300]
El3 0.237 (0.028) [0.183, 0.291]
El4 0.242 (0.028) [0.187, 0.297]
Elc1 0.292 (0.025) [0.243, 0.341]
Elc2 0.252 (0.027) [0.200, 0.305]
Elx1 0.299 (0.026) [0.249, 0.349]
Elx2 0.300 (0.025) [0.250, 0.350]
Ex1 0.244 (0.027) [0.191, 0.297]
Ex2 0.201 (0.026) [0.149, 0.253]
Ex3 0.263 (0.026) [0.213, 0.314]
Ex4 0.241 (0.026) [0.191, 0.292]
Exc1 0.168 (0.029) [0.112, 0.225]
Exc2 0.165 (0.029) [0.109, 0.221]
Exl1 0.293 (0.022) [0.249, 0.337]
Exl2 0.320 (0.024) [0.272, 0.368]
Fs1 0.301 (0.022) [0.257, 0.344]
Fs2 0.325 (0.022) [0.282, 0.369]
Fsv1 0.260 (0.026) [0.209, 0.311]
Fsv2 0.272 (0.027) [0.219, 0.324]
Fv1 0.281 (0.024) [0.234, 0.329]
Fv2 0.291 (0.027) [0.239, 0.344]
Fvs1 0.291 (0.024) [0.244, 0.338]
Fvs2 0.238 (0.026) [0.188, 0.288]
Gt1 0.258 (0.026) [0.207, 0.310]
Gt2 0.307 (0.026) [0.255, 0.359]
Gtx1 0.304 (0.020) [0.265, 0.342]
Gtx2 0.281 (0.025) [0.232, 0.329]
Gx1 0.189 (0.027) [0.136, 0.243]
Gx2 0.253 (0.026) [0.201, 0.304]
Gxt1 0.272 (0.024) [0.225, 0.318]
Gxt2 0.272 (0.026) [0.222, 0.323]
Hs1 0.263 (0.028) [0.209, 0.318]
Hs2 0.309 (0.028) [0.253, 0.364]
Hst1 0.168 (0.028) [0.113, 0.223]
Hst2 0.142 (0.028) [0.088, 0.197]
Ht1 0.267 (0.026) [0.217, 0.318]
Ht2 0.276 (0.026) [0.226, 0.326]
Hts1 0.158 (0.029) [0.101, 0.215]
Hts2 0.150 (0.028) [0.095, 0.206]
Ic1 0.229 (0.026) [0.177, 0.281]
Ic2 0.292 (0.025) [0.244, 0.341]
Icp1 0.188 (0.027) [0.136, 0.241]
Icp2 0.246 (0.027) [0.192, 0.299]
Ik1 0.244 (0.027) [0.192, 0.297]
Ik2 0.259 (0.027) [0.207, 0.312]
Iks1 0.196 (0.027) [0.143, 0.250]
Iks2 0.250 (0.027) [0.197, 0.304]
Ip1 0.165 (0.026) [0.113, 0.216]
Ip2 0.249 (0.025) [0.199, 0.298]
Ip3 0.161 (0.030) [0.102, 0.220]
Ip4 0.248 (0.027) [0.195, 0.302]
Ip5 0.247 (0.027) [0.195, 0.300]
Ip6 0.248 (0.026) [0.197, 0.299]
Ipc1 0.261 (0.024) [0.214, 0.308]
Ipc2 0.277 (0.026) [0.226, 0.329]
Ipv1 0.217 (0.031) [0.156, 0.278]
Ipv2 0.212 (0.031) [0.151, 0.272]
Ipx1 0.265 (0.024) [0.218, 0.312]
Ipx2 0.271 (0.025) [0.221, 0.320]
Is1 0.288 (0.023) [0.242, 0.334]
Is2 0.276 (0.025) [0.227, 0.325]
Is3 0.315 (0.024) [0.268, 0.363]
Is4 0.299 (0.026) [0.248, 0.349]
Isk1 0.223 (0.027) [0.170, 0.275]
Isk2 0.236 (0.026) [0.184, 0.288]
Ist1 0.176 (0.029) [0.120, 0.233]
Ist2 0.176 (0.029) [0.120, 0.232]
It1 0.255 (0.027) [0.202, 0.308]
It2 0.225 (0.028) [0.170, 0.281]
Its1 0.144 (0.029) [0.087, 0.200]
Its2 0.204 (0.028) [0.149, 0.259]
Iv1 0.301 (0.024) [0.255, 0.347]
Iv2 0.262 (0.025) [0.213, 0.312]
Ivp1 0.198 (0.029) [0.141, 0.254]
Ivp2 0.201 (0.030) [0.142, 0.259]
Ix1 0.219 (0.026) [0.168, 0.270]
Ix2 0.259 (0.025) [0.210, 0.309]
Ixp1 0.276 (0.024) [0.229, 0.324]
Ixp2 0.248 (0.027) [0.195, 0.300]
Jk1 0.217 (0.026) [0.167, 0.267]
Jk2 0.271 (0.028) [0.217, 0.326]
Jkx1 0.281 (0.024) [0.234, 0.328]
Jkx2 0.258 (0.025) [0.210, 0.307]
Jx1 0.252 (0.027) [0.199, 0.306]
Jx2 0.230 (0.027) [0.177, 0.283]
Jxk1 0.243 (0.026) [0.192, 0.295]
Jxk2 0.232 (0.027) [0.180, 0.285]
Kk1 0.272 (0.024) [0.226, 0.319]
Kk2 0.323 (0.025) [0.275, 0.372]
Kkl1 -0.063 (0.033) [-0.127, 0.002]
Kkl2 0.054 (0.033) [-0.010, 0.118]
Kl1 0.320 (0.030) [0.260, 0.379]
Kl2 0.334 (0.023) [0.289, 0.378]
Klk1 0.082 (0.030) [0.024, 0.141]
Klk2 0.090 (0.030) [0.031, 0.148]
Lk1 0.269 (0.025) [0.220, 0.317]
Lk2 0.274 (0.026) [0.222, 0.325]
Lkl1 0.182 (0.030) [0.124, 0.240]
Lkl2 0.155 (0.032) [0.091, 0.218]
Ll1 0.270 (0.027) [0.218, 0.323]
Ll2 0.229 (0.027) [0.176, 0.282]
Ll3 0.262 (0.026) [0.212, 0.312]
Ll4 0.217 (0.028) [0.161, 0.272]
Llk1 0.120 (0.034) [0.055, 0.186]
Llk2 0.138 (0.034) [0.071, 0.206]
Llp1 0.123 (0.029) [0.066, 0.181]
Llp2 0.043 (0.029) [-0.013, 0.099]
Lp1 0.228 (0.026) [0.176, 0.279]
Lp2 0.255 (0.028) [0.201, 0.309]
Lp3 0.278 (0.025) [0.229, 0.326]
Lp4 0.300 (0.027) [0.248, 0.353]
Lpl1 0.098 (0.030) [0.040, 0.157]
Lpl2 0.114 (0.030) [0.056, 0.172]
Lpv1 0.161 (0.030) [0.101, 0.221]
Lpv2 0.172 (0.030) [0.113, 0.231]
Lv1 0.225 (0.028) [0.171, 0.280]
Lv2 0.238 (0.026) [0.187, 0.290]
Lvp1 0.169 (0.029) [0.113, 0.226]
Lvp2 0.168 (0.029) [0.111, 0.226]
Mc1 0.122 (0.030) [0.064, 0.180]
Mc2 0.098 (0.030) [0.039, 0.156]
Mc3 0.130 (0.029) [0.073, 0.187]
Mc4 0.030 (0.027) [-0.023, 0.083]
Mch1 0.091 (0.033) [0.026, 0.155]
Mch2 0.179 (0.033) [0.114, 0.244]
Mcl1 0.287 (0.023) [0.242, 0.332]
Mcl2 0.283 (0.024) [0.236, 0.329]
Mh1 0.197 (0.027) [0.144, 0.249]
Mh2 0.235 (0.026) [0.184, 0.287]
Mh3 0.241 (0.025) [0.192, 0.290]
Mh4 0.237 (0.026) [0.185, 0.288]
Mhc1 0.138 (0.035) [0.070, 0.206]
Mhc2 0.172 (0.035) [0.102, 0.241]
Mhl1 0.256 (0.024) [0.208, 0.304]
Mhl2 0.297 (0.024) [0.249, 0.345]
Ml1 0.267 (0.026) [0.217, 0.318]
Ml2 0.239 (0.028) [0.185, 0.293]
Ml3 0.236 (0.027) [0.183, 0.290]
Ml4 0.191 (0.033) [0.127, 0.256]
Mlc1 0.307 (0.024) [0.261, 0.353]
Mlc2 0.273 (0.023) [0.227, 0.318]
Mlh1 0.261 (0.024) [0.213, 0.308]
Mlh2 0.307 (0.023) [0.261, 0.353]
Nk1 0.264 (0.024) [0.217, 0.312]
Nk2 0.269 (0.027) [0.216, 0.321]
Nkp1 0.183 (0.029) [0.126, 0.241]
Nkp2 0.197 (0.030) [0.138, 0.256]
Np1 0.270 (0.025) [0.221, 0.319]
Np2 0.263 (0.033) [0.199, 0.328]
Npk1 0.152 (0.031) [0.091, 0.212]
Npk2 0.169 (0.033) [0.104, 0.234]
Table S14. Same table as above but sorted by homogeneity.
Items Item H se 95% ci
Kl2 0.334 (0.023) [0.289, 0.378]
Fs2 0.325 (0.022) [0.282, 0.369]
Kk2 0.323 (0.025) [0.275, 0.372]
Exl2 0.320 (0.024) [0.272, 0.368]
Kl1 0.320 (0.030) [0.260, 0.379]
Is3 0.315 (0.024) [0.268, 0.363]
Ch2 0.314 (0.022) [0.272, 0.357]
Hs2 0.309 (0.028) [0.253, 0.364]
Bx2 0.307 (0.024) [0.261, 0.354]
Dc3 0.307 (0.024) [0.259, 0.355]
Gt2 0.307 (0.026) [0.255, 0.359]
Mlc1 0.307 (0.024) [0.261, 0.353]
Mlh2 0.307 (0.023) [0.261, 0.353]
Gtx1 0.304 (0.020) [0.265, 0.342]
Bs2 0.303 (0.023) [0.258, 0.349]
Fs1 0.301 (0.022) [0.257, 0.344]
Iv1 0.301 (0.024) [0.255, 0.347]
Elx2 0.300 (0.025) [0.250, 0.350]
Lp4 0.300 (0.027) [0.248, 0.353]
Elx1 0.299 (0.026) [0.249, 0.349]
Is4 0.299 (0.026) [0.248, 0.349]
At2 0.298 (0.027) [0.246, 0.350]
Bs1 0.297 (0.022) [0.254, 0.340]
Ec4 0.297 (0.024) [0.249, 0.345]
Mhl2 0.297 (0.024) [0.249, 0.345]
Exl1 0.293 (0.022) [0.249, 0.337]
Elc1 0.292 (0.025) [0.243, 0.341]
Ic2 0.292 (0.025) [0.244, 0.341]
Fv2 0.291 (0.027) [0.239, 0.344]
Fvs1 0.291 (0.024) [0.244, 0.338]
Is1 0.288 (0.023) [0.242, 0.334]
Mcl1 0.287 (0.023) [0.242, 0.332]
Mcl2 0.283 (0.024) [0.236, 0.329]
At1 0.281 (0.025) [0.233, 0.330]
Fv1 0.281 (0.024) [0.234, 0.329]
Gtx2 0.281 (0.025) [0.232, 0.329]
Jkx1 0.281 (0.024) [0.234, 0.328]
Lp3 0.278 (0.025) [0.229, 0.326]
Ipc2 0.277 (0.026) [0.226, 0.329]
Ht2 0.276 (0.026) [0.226, 0.326]
Is2 0.276 (0.025) [0.227, 0.325]
Ixp1 0.276 (0.024) [0.229, 0.324]
Dc2 0.274 (0.027) [0.221, 0.326]
Lk2 0.274 (0.026) [0.222, 0.325]
Mlc2 0.273 (0.023) [0.227, 0.318]
Fsv2 0.272 (0.027) [0.219, 0.324]
Gxt1 0.272 (0.024) [0.225, 0.318]
Gxt2 0.272 (0.026) [0.222, 0.323]
Kk1 0.272 (0.024) [0.226, 0.319]
Ipx2 0.271 (0.025) [0.221, 0.320]
Jk2 0.271 (0.028) [0.217, 0.326]
Ll1 0.270 (0.027) [0.218, 0.323]
Np1 0.270 (0.025) [0.221, 0.319]
Lk1 0.269 (0.025) [0.220, 0.317]
Nk2 0.269 (0.027) [0.216, 0.321]
Dc4 0.267 (0.025) [0.217, 0.316]
Ecl1 0.267 (0.026) [0.217, 0.317]
Ht1 0.267 (0.026) [0.217, 0.318]
Ml1 0.267 (0.026) [0.217, 0.318]
Ec2 0.266 (0.027) [0.213, 0.319]
Ipx1 0.265 (0.024) [0.218, 0.312]
Nk1 0.264 (0.024) [0.217, 0.312]
Bh2 0.263 (0.027) [0.209, 0.317]
Ec1 0.263 (0.025) [0.213, 0.313]
Ex3 0.263 (0.026) [0.213, 0.314]
Hs1 0.263 (0.028) [0.209, 0.318]
Np2 0.263 (0.033) [0.199, 0.328]
Bvs2 0.262 (0.025) [0.212, 0.312]
Iv2 0.262 (0.025) [0.213, 0.312]
Ll3 0.262 (0.026) [0.212, 0.312]
Ec3 0.261 (0.029) [0.204, 0.319]
Ecl2 0.261 (0.025) [0.211, 0.311]
Ipc1 0.261 (0.024) [0.214, 0.308]
Mlh1 0.261 (0.024) [0.213, 0.308]
Bh1 0.260 (0.024) [0.213, 0.307]
Fsv1 0.260 (0.026) [0.209, 0.311]
Ik2 0.259 (0.027) [0.207, 0.312]
Ix2 0.259 (0.025) [0.210, 0.309]
Gt1 0.258 (0.026) [0.207, 0.310]
Jkx2 0.258 (0.025) [0.210, 0.307]
Dx2 0.256 (0.026) [0.204, 0.308]
Mhl1 0.256 (0.024) [0.208, 0.304]
It1 0.255 (0.027) [0.202, 0.308]
Lp2 0.255 (0.028) [0.201, 0.309]
Bvs1 0.253 (0.025) [0.203, 0.302]
Gx2 0.253 (0.026) [0.201, 0.304]
Akt1 0.252 (0.027) [0.199, 0.305]
Akt2 0.252 (0.027) [0.199, 0.305]
Elc2 0.252 (0.027) [0.200, 0.305]
Jx1 0.252 (0.027) [0.199, 0.306]
Ch1 0.251 (0.025) [0.203, 0.299]
El2 0.250 (0.026) [0.200, 0.300]
Iks2 0.250 (0.027) [0.197, 0.304]
Bv2 0.249 (0.026) [0.197, 0.300]
Dc1 0.249 (0.025) [0.200, 0.299]
Ip2 0.249 (0.025) [0.199, 0.298]
Ip4 0.248 (0.027) [0.195, 0.302]
Ip6 0.248 (0.026) [0.197, 0.299]
Ixp2 0.248 (0.027) [0.195, 0.300]
Bsv1 0.247 (0.026) [0.196, 0.299]
Ip5 0.247 (0.027) [0.195, 0.300]
Cv2 0.246 (0.028) [0.192, 0.301]
Icp2 0.246 (0.027) [0.192, 0.299]
Ak2 0.244 (0.027) [0.191, 0.298]
Ex1 0.244 (0.027) [0.191, 0.297]
Ik1 0.244 (0.027) [0.192, 0.297]
Jxk1 0.243 (0.026) [0.192, 0.295]
El4 0.242 (0.028) [0.187, 0.297]
Bv1 0.241 (0.027) [0.188, 0.294]
Ex4 0.241 (0.026) [0.191, 0.292]
Mh3 0.241 (0.025) [0.192, 0.290]
Ml2 0.239 (0.028) [0.185, 0.293]
Fvs2 0.238 (0.026) [0.188, 0.288]
Lv2 0.238 (0.026) [0.187, 0.290]
Bsv2 0.237 (0.028) [0.182, 0.292]
El3 0.237 (0.028) [0.183, 0.291]
Mh4 0.237 (0.026) [0.185, 0.288]
Isk2 0.236 (0.026) [0.184, 0.288]
Ml3 0.236 (0.027) [0.183, 0.290]
Mh2 0.235 (0.026) [0.184, 0.287]
Atk2 0.233 (0.027) [0.180, 0.286]
Dx1 0.233 (0.026) [0.183, 0.284]
Jxk2 0.232 (0.027) [0.180, 0.285]
Bx1 0.231 (0.028) [0.176, 0.286]
Cvh2 0.231 (0.027) [0.178, 0.283]
Ak1 0.230 (0.030) [0.172, 0.288]
Cvh1 0.230 (0.027) [0.177, 0.283]
Jx2 0.230 (0.027) [0.177, 0.283]
Cv1 0.229 (0.027) [0.176, 0.281]
Ic1 0.229 (0.026) [0.177, 0.281]
Ll2 0.229 (0.027) [0.176, 0.282]
Lp1 0.228 (0.026) [0.176, 0.279]
Dxc1 0.225 (0.029) [0.169, 0.281]
It2 0.225 (0.028) [0.170, 0.281]
Lv1 0.225 (0.028) [0.171, 0.280]
Isk1 0.223 (0.027) [0.170, 0.275]
Dh1 0.221 (0.027) [0.168, 0.273]
El1 0.219 (0.030) [0.160, 0.277]
Ix1 0.219 (0.026) [0.168, 0.270]
Ipv1 0.217 (0.031) [0.156, 0.278]
Jk1 0.217 (0.026) [0.167, 0.267]
Ll4 0.217 (0.028) [0.161, 0.272]
Atk1 0.212 (0.028) [0.158, 0.266]
Ipv2 0.212 (0.031) [0.151, 0.272]
Bxh1 0.211 (0.028) [0.155, 0.267]
Ecx2 0.211 (0.029) [0.155, 0.267]
Its2 0.204 (0.028) [0.149, 0.259]
Ex2 0.201 (0.026) [0.149, 0.253]
Ivp2 0.201 (0.030) [0.142, 0.259]
Dch2 0.200 (0.037) [0.129, 0.272]
Dxc2 0.198 (0.029) [0.141, 0.254]
Ivp1 0.198 (0.029) [0.141, 0.254]
Mh1 0.197 (0.027) [0.144, 0.249]
Nkp2 0.197 (0.030) [0.138, 0.256]
Iks1 0.196 (0.027) [0.143, 0.250]
Dcx1 0.193 (0.028) [0.139, 0.247]
Dh2 0.192 (0.031) [0.131, 0.253]
Ecx1 0.192 (0.029) [0.136, 0.248]
Ml4 0.191 (0.033) [0.127, 0.256]
Gx1 0.189 (0.027) [0.136, 0.243]
Icp1 0.188 (0.027) [0.136, 0.241]
Nkp1 0.183 (0.029) [0.126, 0.241]
Dcx2 0.182 (0.028) [0.127, 0.237]
Lkl1 0.182 (0.030) [0.124, 0.240]
Mch2 0.179 (0.033) [0.114, 0.244]
Ist1 0.176 (0.029) [0.120, 0.233]
Ist2 0.176 (0.029) [0.120, 0.232]
Lpv2 0.172 (0.030) [0.113, 0.231]
Mhc2 0.172 (0.035) [0.102, 0.241]
Chv2 0.170 (0.028) [0.116, 0.224]
Lvp1 0.169 (0.029) [0.113, 0.226]
Npk2 0.169 (0.033) [0.104, 0.234]
Exc1 0.168 (0.029) [0.112, 0.225]
Hst1 0.168 (0.028) [0.113, 0.223]
Lvp2 0.168 (0.029) [0.111, 0.226]
Chv1 0.167 (0.028) [0.112, 0.222]
Exc2 0.165 (0.029) [0.109, 0.221]
Ip1 0.165 (0.026) [0.113, 0.216]
Ip3 0.161 (0.030) [0.102, 0.220]
Lpv1 0.161 (0.030) [0.101, 0.221]
Hts1 0.158 (0.029) [0.101, 0.215]
Lkl2 0.155 (0.032) [0.091, 0.218]
Bhx1 0.154 (0.029) [0.098, 0.209]
Npk1 0.152 (0.031) [0.091, 0.212]
Hts2 0.150 (0.028) [0.095, 0.206]
Bhx2 0.147 (0.028) [0.092, 0.203]
Its1 0.144 (0.029) [0.087, 0.200]
Hst2 0.142 (0.028) [0.088, 0.197]
Llk2 0.138 (0.034) [0.071, 0.206]
Mhc1 0.138 (0.035) [0.070, 0.206]
Bxh2 0.133 (0.028) [0.077, 0.189]
Mc3 0.130 (0.029) [0.073, 0.187]
Llp1 0.123 (0.029) [0.066, 0.181]
Mc1 0.122 (0.030) [0.064, 0.180]
Llk1 0.120 (0.034) [0.055, 0.186]
Lpl2 0.114 (0.030) [0.056, 0.172]
Lpl1 0.098 (0.030) [0.040, 0.157]
Mc2 0.098 (0.030) [0.039, 0.156]
Mch1 0.091 (0.033) [0.026, 0.155]
Klk2 0.090 (0.030) [0.031, 0.148]
Klk1 0.082 (0.030) [0.024, 0.141]
Dch1 0.075 (0.035) [0.006, 0.143]
Dhc2 0.056 (0.042) [-0.026, 0.138]
Kkl2 0.054 (0.033) [-0.010, 0.118]
Llp2 0.043 (0.029) [-0.013, 0.099]
Mc4 0.030 (0.027) [-0.023, 0.083]
Dhc1 -0.020 (0.043) [-0.104, 0.064]
Kkl1 -0.063 (0.033) [-0.127, 0.002]

The complete item set has a homogeneity value H (se, 95%CI) of 0.223, (0.008), [0.207, 0.238]: this is significantly lower than the recommended 0.30, suggesting that the scale is not homogeneous. This is further supported by the fact that few items have a homogeneity around or above this value (only 19 if we consider the point estimate, and 93 out of 208 if we consider a 95%CI with an upper limit above 0.3). Interestingly, the homogeneity values of related items (different presentations and different orders of the tones) are overall very similar, suggesting again that this is an intrinsic property of the segments and tone(s) and not of their repeated presentation or of the order of the tones.
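For concreteness, the Loevinger H coefficients reported in these tables can be computed directly from a 0/1 response matrix; the sketch below is a minimal pure-Python illustration of the definition (observed inter-item covariances over the maximum covariances attainable given the item means), not the R code used for this document:

```python
# Minimal illustration of Loevinger's item scalability coefficients H_i
# for a 0/1 response matrix (rows = persons, columns = items).
def item_homogeneity(X):
    n, k = len(X), len(X[0])
    p = [sum(row[j] for row in X) / n for j in range(k)]  # item means

    def cov(i, j):
        # population covariance of two binary items: P(X_i=1, X_j=1) - p_i*p_j
        p11 = sum(row[i] * row[j] for row in X) / n
        return p11 - p[i] * p[j]

    H = []
    for i in range(k):
        num = sum(cov(i, j) for j in range(k) if j != i)
        # maximum covariance given the item means:
        # min(p_i, p_j) - p_i*p_j = min(p_i, p_j) * (1 - max(p_i, p_j))
        den = sum(min(p[i], p[j]) * (1 - max(p[i], p[j]))
                  for j in range(k) if j != i)
        H.append(num / den)
    return H

# A perfect Guttman scale has H_i = 1 for every item:
print(item_homogeneity([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]))
# → [1.0, 1.0, 1.0]
```

The total H of a scale is the analogous ratio with the numerator and denominator summed over all item pairs.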

Table S15. ‘Flipped’ weird items: MSA: aisp for increasing H thresholds (c) for all items.
Items c=0.05 c=0.10 c=0.15 c=0.20 c=0.25 c=0.30 c=0.35 c=0.40 c=0.45 c=0.50 c=0.55 c=0.60
Ak1 1 1 1 1 1 9 25 0 0 0 0 0
Ak2 1 1 1 1 1 1 7 15 23 0 0 0
Akt1 1 1 1 1 1 2 2 12 21 29 0 0
Akt2 1 1 1 1 1 2 2 47 0 0 0 0
At1 1 1 1 1 1 1 1 0 0 0 0 0
At2 1 1 1 1 1 1 1 7 12 17 0 0
Atk1 1 1 1 1 1 2 2 43 0 0 0 0
Atk2 1 1 1 1 1 2 2 0 0 0 0 0
Bh1 1 1 1 1 1 1 1 8 0 0 0 0
Bh2 1 1 1 1 1 1 11 24 36 0 0 0
Bhx1 2 2 2 7 5 6 5 40 0 0 0 0
Bhx2 2 2 2 7 4 5 12 25 39 0 0 0
Bs1 1 1 1 1 1 1 1 1 1 0 0 0
Bs2 1 1 1 1 1 1 1 1 1 14 14 0
Bsv1 1 1 1 1 1 2 4 18 31 0 0 0
Bsv2 1 1 1 1 1 2 2 14 20 0 0 0
Bv1 1 1 1 1 1 1 5 10 15 23 0 0
Bv2 1 1 1 1 1 1 17 35 0 0 0 0
Bvs1 1 1 1 1 1 2 2 23 35 0 0 0
Bvs2 1 1 1 1 1 2 2 14 20 27 0 0
Bx1 1 1 1 1 1 1 11 24 42 0 0 0
Bx2 1 1 1 1 1 1 1 1 1 1 1 1
Bxh1 1 1 1 1 1 1 11 26 41 0 0 0
Bxh2 0 0 6 0 0 0 17 0 0 0 0 0
Ch1 1 1 1 1 1 1 5 33 0 0 0 0
Ch2 1 1 1 1 1 1 1 1 9 12 12 0
Chv1 1 1 1 3 9 14 0 0 0 0 0 0
Chv2 3 3 3 3 9 0 13 0 0 0 0 0
Cv1 1 1 1 1 1 12 30 0 0 0 0 0
Cv2 1 1 1 1 1 1 16 32 0 0 0 0
Cvh1 1 1 1 1 1 2 21 46 0 0 0 0
Cvh2 1 1 1 1 1 2 13 28 0 0 0 0
Dc1 1 1 1 1 1 0 30 0 0 0 0 0
Dc2 1 1 1 1 1 1 1 8 11 19 0 0
Dc3 1 1 1 1 1 1 1 1 28 0 0 0
Dc4 1 1 1 1 1 1 5 36 0 0 0 0
Dch1 3 3 3 5 0 16 0 0 0 0 0 0
Dch2 3 3 3 3 3 2 14 6 20 27 0 0
Dcx1 1 1 1 1 3 2 14 29 0 0 0 0
Dcx2 3 3 3 3 3 8 14 29 0 0 0 0
Dh1 1 1 1 1 1 6 0 0 40 0 0 0
Dh2 2 2 2 2 2 3 3 3 4 6 6 0
Dhc1 3 3 5 5 0 0 0 0 0 0 0 0
Dhc2 0 0 5 5 7 16 0 0 0 0 0 0
Dx1 1 1 1 1 1 1 20 41 16 24 0 0
Dx2 1 1 1 1 1 1 22 11 28 0 0 0
Dxc1 3 3 3 3 3 2 23 20 6 8 8 0
Dxc2 1 1 1 1 0 2 0 0 0 0 0 0
Ec1 1 1 1 1 1 1 10 27 0 0 0 0
Ec2 1 1 1 1 1 1 1 8 33 0 0 0
Ec3 1 1 1 1 1 1 8 4 27 0 0 0
Ec4 1 1 1 1 1 1 1 1 1 4 4 4
Ecl1 1 1 1 1 1 2 2 31 0 0 0 0
Ecl2 1 1 1 1 1 2 2 23 0 0 0 0
Ecx1 3 3 3 3 14 0 0 38 24 0 0 0
Ecx2 1 1 1 1 1 2 14 0 17 10 10 0
El1 1 1 1 1 4 13 20 41 0 0 0 0
El2 1 1 1 1 1 1 10 0 0 0 0 0
El3 1 1 1 1 1 1 26 36 0 0 0 0
El4 1 1 1 1 1 1 5 36 0 0 0 0
Elc1 1 1 1 1 1 2 2 2 7 0 0 0
Elc2 1 1 1 1 1 2 2 2 37 0 0 0
Elx1 1 1 1 1 1 1 1 2 3 5 0 0
Elx2 1 1 1 1 1 1 2 2 3 5 5 0
Ex1 1 1 1 1 1 1 1 8 13 4 4 4
Ex2 1 1 1 1 5 6 0 26 41 0 0 0
Ex3 1 1 1 1 1 1 1 1 1 2 2 2
Ex4 1 1 1 1 1 1 11 24 36 0 0 0
Exc1 3 3 3 3 14 0 0 0 0 0 0 0
Exc2 3 3 3 3 3 8 4 18 31 0 0 0
Exl1 1 1 1 1 1 2 2 2 3 28 0 0
Exl2 1 1 1 1 1 1 1 2 3 20 0 0
Fs1 1 1 1 1 1 1 1 1 16 24 0 0
Fs2 1 1 1 1 1 1 1 1 10 0 0 0
Fsv1 1 1 1 1 1 2 2 22 34 0 0 0
Fsv2 1 1 1 1 1 2 2 2 37 0 0 0
Fv1 1 1 1 1 1 1 1 7 9 17 0 0
Fv2 1 1 1 1 1 1 2 31 8 11 11 0
Fvs1 1 1 1 1 1 2 2 2 3 0 0 0
Fvs2 1 1 1 1 1 2 2 12 22 30 0 0
Gt1 1 1 1 1 1 1 5 10 40 0 0 0
Gt2 1 1 1 1 1 1 1 1 1 14 14 0
Gtx1 1 1 1 1 1 2 2 2 3 5 5 0
Gtx2 1 1 1 1 1 2 2 2 2 3 3 3
Gx1 1 1 1 1 5 0 6 13 26 31 0 0
Gx2 1 1 1 1 1 1 1 8 11 19 0 0
Gxt1 1 1 1 1 1 2 2 2 2 10 10 0
Gxt2 1 1 1 1 1 2 2 12 21 29 0 0
Hs1 1 1 1 1 1 1 6 13 13 31 0 0
Hs2 1 1 1 1 1 1 1 1 10 13 13 0
Hst1 2 2 2 4 12 0 0 0 0 0 0 0
Hst2 2 2 2 4 12 20 0 0 0 0 0 0
Ht1 1 1 1 1 1 1 11 34 0 0 0 0
Ht2 1 1 1 1 1 1 10 4 5 7 7 0
Hts1 2 2 2 0 12 9 25 0 0 0 0 0
Hts2 2 2 6 4 12 20 0 0 0 0 0 0
Ic1 1 1 1 1 1 1 0 0 33 0 0 0
Ic2 1 1 1 1 1 1 1 1 1 0 0 0
Icp1 3 3 3 3 3 17 15 30 0 0 0 0
Icp2 1 1 1 1 1 2 23 0 0 0 0 0
Ik1 1 1 1 1 1 18 0 0 0 0 0 0
Ik2 1 1 1 1 1 1 7 33 0 0 0 0
Iks1 3 3 3 3 3 17 0 0 0 0 0 0
Iks2 1 1 1 1 1 2 2 2 0 0 0 0
Ip1 1 1 1 8 13 0 0 0 0 0 0 0
Ip2 1 1 1 1 1 1 1 45 0 0 0 0
Ip3 2 2 4 6 0 0 0 0 0 0 0 0
Ip4 1 1 1 1 1 1 6 7 26 0 0 0
Ip5 1 1 1 1 1 1 8 42 0 0 0 0
Ip6 1 1 1 1 1 1 16 32 0 0 0 0
Ipc1 1 1 1 1 1 2 2 12 22 30 0 0
Ipc2 1 1 1 1 1 2 2 2 2 18 0 0
Ipv1 1 1 1 1 1 2 2 5 7 9 9 0
Ipv2 1 1 1 1 1 2 2 2 2 3 3 3
Ipx1 1 1 1 1 1 2 2 5 7 9 9 0
Ipx2 1 1 1 1 1 2 2 20 0 0 0 0
Is1 1 1 1 1 1 1 1 6 9 15 15 0
Is2 1 1 1 1 1 1 9 0 0 0 0 0
Is3 1 1 1 1 1 1 1 4 5 7 7 0
Is4 1 1 1 1 1 1 1 6 9 15 15 0
Isk1 1 1 1 1 1 14 19 39 0 0 0 0
Isk2 1 1 1 1 1 2 23 43 0 0 0 0
Ist1 2 2 7 0 11 18 10 0 0 0 0 0
Ist2 2 2 2 7 4 19 0 42 0 0 0 0
It1 1 1 1 1 1 1 7 15 23 0 0 0
It2 1 1 1 1 1 0 0 0 0 0 0 0
Its1 2 2 6 0 0 0 0 0 0 0 0 0
Its2 1 1 1 1 4 13 8 27 0 0 0 0
Iv1 1 1 1 1 1 1 1 1 30 16 16 0
Iv2 1 1 1 1 1 1 8 21 27 0 0 0
Ivp1 1 1 1 1 1 2 15 30 0 0 0 0
Ivp2 1 1 1 1 1 2 2 23 35 0 0 0
Ix1 1 1 1 1 4 5 12 25 39 0 0 0
Ix2 1 1 1 1 1 6 17 35 0 20 0 0
Ixp1 1 1 1 1 1 2 2 5 17 0 0 0
Ixp2 1 1 1 1 1 2 2 17 29 0 0 0
Jk1 1 1 1 1 11 0 7 15 0 0 0 0
Jk2 1 1 1 1 1 1 24 0 0 0 0 0
Jkx1 1 1 1 1 1 2 2 2 24 0 0 0
Jkx2 1 1 1 1 1 2 2 38 0 0 0 0
Jx1 1 1 1 1 1 1 1 4 5 16 16 0
Jx2 1 1 1 1 1 19 0 0 0 0 0 0
Jxk1 1 1 1 1 1 2 4 9 14 22 0 0
Jxk2 1 1 1 1 1 2 2 43 0 0 0 0
Kk1 1 1 1 1 1 1 1 8 0 0 0 0
Kk2 1 1 1 1 1 1 1 1 1 2 2 2
Kkl1 0 0 7 4 8 15 0 0 0 0 0 0
Kkl2 2 2 2 4 8 15 26 0 0 0 0 0
Kl1 1 1 1 1 1 1 1 1 9 12 12 0
Kl2 1 1 1 1 1 1 1 1 1 1 1 1
Klk1 2 2 7 4 15 12 28 0 0 0 0 0
Klk2 2 2 2 4 15 0 24 44 0 0 0 0
Lk1 1 1 1 1 1 1 10 4 30 0 0 0
Lk2 1 1 1 1 1 1 1 1 10 13 13 0
Lkl1 1 1 1 1 6 10 29 0 25 0 0 0
Lkl2 3 3 3 3 6 10 29 0 0 0 0 0
Ll1 1 1 1 1 1 1 1 11 19 26 0 0
Ll2 1 1 1 1 1 1 8 7 0 0 0 0
Ll3 1 1 1 1 1 1 1 21 27 0 0 0
Ll4 1 1 1 1 4 12 28 0 0 0 0 0
Llk1 3 3 3 0 6 0 0 0 0 0 0 0
Llk2 3 3 3 3 6 10 0 47 0 0 0 0
Llp1 2 2 4 6 10 11 22 0 0 0 0 0
Llp2 2 2 4 6 10 0 0 0 0 0 0 0
Lp1 1 1 1 1 1 19 0 0 0 0 0 0
Lp2 1 1 1 1 1 1 1 16 0 0 0 0
Lp3 1 1 1 1 1 1 1 16 12 25 0 0
Lp4 1 1 1 1 1 1 1 1 8 12 0 0
Lpl1 2 2 4 8 13 0 0 34 0 0 0 0
Lpl2 2 2 4 6 0 0 0 0 0 0 0 0
Lpv1 3 3 3 3 3 10 27 0 0 0 0 0
Lpv2 3 3 3 3 3 2 27 17 29 0 0 0
Lv1 1 1 1 1 1 1 5 10 15 23 0 0
Lv2 1 1 1 1 1 1 5 40 0 0 0 0
Lvp1 3 3 3 3 3 10 27 0 0 0 0 0
Lvp2 3 3 3 3 3 2 13 28 0 0 0 0
Mc1 2 2 2 2 2 3 0 0 0 0 0 0
Mc2 2 2 2 2 2 3 3 3 4 6 6 0
Mc3 2 2 2 2 2 7 0 0 0 0 0 0
Mc4 0 0 2 2 0 0 0 0 0 0 0 0
Mch1 3 3 3 9 7 0 0 0 0 0 0 0
Mch2 3 3 3 3 7 8 4 9 14 22 0 0
Mcl1 1 1 1 1 1 2 2 2 2 21 0 0
Mcl2 1 1 1 1 1 2 2 2 2 18 0 0
Mh1 1 1 1 1 2 7 18 37 0 0 0 0
Mh2 1 1 1 1 1 1 7 0 0 0 0 0
Mh3 1 1 1 1 1 7 18 37 0 0 0 0
Mh4 1 1 1 1 1 5 18 45 0 0 0 0
Mhc1 3 3 3 9 7 0 0 0 38 0 0 0
Mhc2 3 3 3 3 7 8 0 22 34 28 0 0
Mhl1 1 1 1 1 1 2 2 14 25 0 0 0
Mhl2 1 1 1 1 1 2 2 2 2 18 0 0
Ml1 1 1 1 1 1 1 1 1 12 25 0 0
Ml2 1 1 1 1 1 1 7 44 42 0 0 0
Ml3 1 1 1 1 1 3 17 0 0 0 0 0
Ml4 1 1 1 1 4 11 0 0 0 0 0 0
Mlc1 1 1 1 1 1 1 1 2 8 11 11 0
Mlc2 1 1 1 1 1 2 2 2 6 8 8 0
Mlh1 1 1 1 1 1 2 2 20 38 0 0 0
Mlh2 1 1 1 1 1 1 2 2 18 21 0 0
Nk1 1 1 1 1 1 1 6 0 0 0 0 0
Nk2 1 1 1 1 1 1 19 39 0 0 0 0
Nkp1 1 1 1 1 3 0 0 0 18 0 0 0
Nkp2 3 3 3 3 3 2 21 46 0 0 0 0
Np1 1 1 1 1 1 1 1 1 16 2 0 0
Np2 1 1 1 1 1 1 8 11 19 26 0 0
Npk1 3 3 3 3 3 4 9 19 32 0 0 0
Npk2 3 3 3 3 3 4 9 19 32 0 0 0

The results are slightly better, but still far from ideal for IRT…

2.4.4.5 Conclusions

So, it seems that the “weird” stimuli Bhx, Hst, Ist, Kkl and Llp (all presentations and variants) are perceived by the participants as (rather difficult) ‘same’-type items and not as the intended ‘different’-type items. With this change, the items seem to fall into the two natural classes ‘same’ vs ‘different’ (even if there is a lot of unaccounted variation).

2.4.5 Let’s reduce the item set

The MSA suggests that there are too many items; moreover, the successive presentations of the same item do not seem to make a difference and, for the (by design) ‘different’ items, the order of the tones does not seem to matter either. If this is so, we can reduce the item set by:

  • keeping systematically the 2nd presentation,
  • and, for the ‘different’ items, only one order of tones (say, the 1st in alphabetical order).
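The reduction rule above can be sketched as follows, assuming the item naming scheme visible in the tables (segment class letters followed by a trailing presentation number, e.g. Akt2, with ‘different’ items coming in tone-order permutations such as Akt vs Atk); the helper name and the name-parsing convention are illustrative assumptions, not code from this document:

```python
# Hypothetical sketch of the reduction rule, assuming item names are
# '<segment class letters><presentation number>', e.g. 'Akt2', and that
# 'different' items come in tone-order permutations (e.g. 'Akt' vs 'Atk').
def reduce_items(items):
    kept = [it for it in items if it.endswith("2")]    # 2nd presentation only
    by_class = {}
    for it in kept:
        stem = it[:-1]                                 # drop presentation number
        key = stem[0] + "".join(sorted(stem[1:]))      # ignore the tone order
        by_class.setdefault(key, []).append(it)
    # within each class, keep the 1st tone order in alphabetical order
    return sorted(min(group) for group in by_class.values())

print(reduce_items(["Akt1", "Akt2", "Atk1", "Atk2", "Bs1", "Bs2"]))
# → ['Akt2', 'Bs2']
```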

With these:

Table S16. ‘Flipped’ weird items: MSA: Homogeneity values (with standard errors and 95%CIs) for all items, sorted by item name.
Items Item H se 95% ci
Ak1 0.240 (0.029) [0.182, 0.297]
Akt1 0.237 (0.029) [0.180, 0.293]
At1 0.276 (0.025) [0.228, 0.324]
Bh1 0.273 (0.025) [0.224, 0.322]
Bhx1 0.175 (0.031) [0.114, 0.236]
Bs1 0.305 (0.022) [0.262, 0.347]
Bsv1 0.239 (0.027) [0.185, 0.293]
Bv1 0.257 (0.029) [0.201, 0.313]
Bx1 0.248 (0.030) [0.190, 0.306]
Ch1 0.272 (0.025) [0.223, 0.321]
Chv1 0.153 (0.030) [0.095, 0.211]
Cv1 0.243 (0.026) [0.193, 0.294]
Dc1 0.257 (0.026) [0.207, 0.307]
Dch1 0.062 (0.039) [-0.015, 0.138]
Dcx1 0.173 (0.029) [0.117, 0.229]
Dh1 0.230 (0.027) [0.176, 0.284]
Dx1 0.251 (0.028) [0.196, 0.306]
Ec1 0.274 (0.027) [0.221, 0.328]
Ecl1 0.254 (0.027) [0.200, 0.307]
Ecx1 0.185 (0.031) [0.125, 0.246]
El1 0.231 (0.031) [0.169, 0.292]
Elx1 0.281 (0.028) [0.227, 0.335]
Ex1 0.263 (0.029) [0.207, 0.319]
Fs1 0.321 (0.024) [0.275, 0.368]
Fsv1 0.236 (0.027) [0.184, 0.288]
Fv1 0.295 (0.025) [0.246, 0.344]
Gt1 0.276 (0.027) [0.222, 0.329]
Gtx1 0.290 (0.021) [0.248, 0.331]
Gx1 0.205 (0.030) [0.147, 0.263]
Hs1 0.300 (0.028) [0.245, 0.354]
Hst1 0.185 (0.029) [0.127, 0.242]
Ht1 0.272 (0.027) [0.219, 0.324]
Ic1 0.240 (0.028) [0.186, 0.295]
Icp1 0.184 (0.028) [0.129, 0.239]
Ik1 0.257 (0.030) [0.197, 0.317]
Iks1 0.191 (0.029) [0.136, 0.247]
Ip1 0.175 (0.027) [0.121, 0.228]
Ipv1 0.211 (0.035) [0.143, 0.279]
Ipx1 0.249 (0.025) [0.200, 0.298]
Is1 0.292 (0.024) [0.246, 0.339]
Ist1 0.199 (0.031) [0.139, 0.259]
It1 0.253 (0.029) [0.196, 0.310]
Iv1 0.309 (0.024) [0.261, 0.357]
Ix1 0.231 (0.027) [0.179, 0.283]
Jk1 0.220 (0.027) [0.168, 0.273]
Jkx1 0.265 (0.025) [0.215, 0.314]
Jx1 0.272 (0.029) [0.214, 0.329]
Kk1 0.285 (0.027) [0.232, 0.337]
Kkl1 -0.033 (0.038) [-0.108, 0.041]
Kl1 0.327 (0.033) [0.263, 0.392]
Lk1 0.272 (0.025) [0.223, 0.322]
Lkl1 0.174 (0.032) [0.111, 0.237]
Ll1 0.277 (0.028) [0.222, 0.332]
Llp1 0.132 (0.032) [0.069, 0.195]
Lp1 0.245 (0.027) [0.192, 0.298]
Lpv1 0.157 (0.034) [0.091, 0.224]
Lv1 0.247 (0.029) [0.190, 0.303]
Mc1 0.145 (0.033) [0.080, 0.210]
Mch1 0.071 (0.037) [-0.002, 0.143]
Mcl1 0.277 (0.023) [0.232, 0.323]
Mh1 0.212 (0.028) [0.156, 0.267]
Mhl1 0.244 (0.025) [0.195, 0.293]
Ml1 0.270 (0.028) [0.216, 0.324]
Nk1 0.274 (0.026) [0.223, 0.325]
Nkp1 0.189 (0.032) [0.126, 0.251]
Np1 0.281 (0.025) [0.231, 0.331]
Table S17. Same table as above but sorted by homogeneity.
Items Item H se 95% ci
Kl1 0.327 (0.033) [0.263, 0.392]
Fs1 0.321 (0.024) [0.275, 0.368]
Iv1 0.309 (0.024) [0.261, 0.357]
Bs1 0.305 (0.022) [0.262, 0.347]
Hs1 0.300 (0.028) [0.245, 0.354]
Fv1 0.295 (0.025) [0.246, 0.344]
Is1 0.292 (0.024) [0.246, 0.339]
Gtx1 0.290 (0.021) [0.248, 0.331]
Kk1 0.285 (0.027) [0.232, 0.337]
Elx1 0.281 (0.028) [0.227, 0.335]
Np1 0.281 (0.025) [0.231, 0.331]
Ll1 0.277 (0.028) [0.222, 0.332]
Mcl1 0.277 (0.023) [0.232, 0.323]
At1 0.276 (0.025) [0.228, 0.324]
Gt1 0.276 (0.027) [0.222, 0.329]
Ec1 0.274 (0.027) [0.221, 0.328]
Nk1 0.274 (0.026) [0.223, 0.325]
Bh1 0.273 (0.025) [0.224, 0.322]
Ch1 0.272 (0.025) [0.223, 0.321]
Ht1 0.272 (0.027) [0.219, 0.324]
Jx1 0.272 (0.029) [0.214, 0.329]
Lk1 0.272 (0.025) [0.223, 0.322]
Ml1 0.270 (0.028) [0.216, 0.324]
Jkx1 0.265 (0.025) [0.215, 0.314]
Ex1 0.263 (0.029) [0.207, 0.319]
Bv1 0.257 (0.029) [0.201, 0.313]
Dc1 0.257 (0.026) [0.207, 0.307]
Ik1 0.257 (0.030) [0.197, 0.317]
Ecl1 0.254 (0.027) [0.200, 0.307]
It1 0.253 (0.029) [0.196, 0.310]
Dx1 0.251 (0.028) [0.196, 0.306]
Ipx1 0.249 (0.025) [0.200, 0.298]
Bx1 0.248 (0.030) [0.190, 0.306]
Lv1 0.247 (0.029) [0.190, 0.303]
Lp1 0.245 (0.027) [0.192, 0.298]
Mhl1 0.244 (0.025) [0.195, 0.293]
Cv1 0.243 (0.026) [0.193, 0.294]
Ak1 0.240 (0.029) [0.182, 0.297]
Ic1 0.240 (0.028) [0.186, 0.295]
Bsv1 0.239 (0.027) [0.185, 0.293]
Akt1 0.237 (0.029) [0.180, 0.293]
Fsv1 0.236 (0.027) [0.184, 0.288]
El1 0.231 (0.031) [0.169, 0.292]
Ix1 0.231 (0.027) [0.179, 0.283]
Dh1 0.230 (0.027) [0.176, 0.284]
Jk1 0.220 (0.027) [0.168, 0.273]
Mh1 0.212 (0.028) [0.156, 0.267]
Ipv1 0.211 (0.035) [0.143, 0.279]
Gx1 0.205 (0.030) [0.147, 0.263]
Ist1 0.199 (0.031) [0.139, 0.259]
Iks1 0.191 (0.029) [0.136, 0.247]
Nkp1 0.189 (0.032) [0.126, 0.251]
Ecx1 0.185 (0.031) [0.125, 0.246]
Hst1 0.185 (0.029) [0.127, 0.242]
Icp1 0.184 (0.028) [0.129, 0.239]
Bhx1 0.175 (0.031) [0.114, 0.236]
Ip1 0.175 (0.027) [0.121, 0.228]
Lkl1 0.174 (0.032) [0.111, 0.237]
Dcx1 0.173 (0.029) [0.117, 0.229]
Lpv1 0.157 (0.034) [0.091, 0.224]
Chv1 0.153 (0.030) [0.095, 0.211]
Mc1 0.145 (0.033) [0.080, 0.210]
Llp1 0.132 (0.032) [0.069, 0.195]
Mch1 0.071 (0.037) [-0.002, 0.143]
Dch1 0.062 (0.039) [-0.015, 0.138]
Kkl1 -0.033 (0.038) [-0.108, 0.041]

The complete item set has a homogeneity value H (se, 95%CI) of 0.230, (0.010), [0.210, 0.249]: this is significantly lower than the recommended 0.30, suggesting that the scale is not homogeneous. This is further supported by the fact that few items have a homogeneity around or above this value (only 5 if we consider the point estimate, and 33 out of 66 if we consider a 95%CI with an upper limit above 0.3). Interestingly, the homogeneity values of related items (different presentations and different orders of the tones) are overall very similar, suggesting again that this is an intrinsic property of the segments and tone(s) and not of their repeated presentation or of the order of the tones.

Table S18. ‘Flipped’ weird items: MSA: aisp for increasing H thresholds (c) for all items.
Items c=0.05 c=0.10 c=0.15 c=0.20 c=0.25 c=0.30 c=0.35 c=0.40 c=0.45 c=0.50 c=0.55 c=0.60
Ak1 1 1 1 1 1 3 13 0 0 0 0 NA
Akt1 1 1 1 1 2 2 2 13 0 0 0 NA
At1 1 1 1 1 1 1 11 0 0 0 0 NA
Bh1 1 1 1 1 1 1 4 3 0 0 0 NA
Bhx1 1 1 1 1 1 7 0 0 0 0 0 NA
Bs1 1 1 1 1 1 1 1 5 6 0 0 NA
Bsv1 1 1 1 1 2 2 12 0 0 0 0 NA
Bv1 1 1 1 1 1 1 3 5 6 6 0 NA
Bx1 1 1 1 1 1 1 8 10 0 0 0 NA
Ch1 1 1 1 1 1 1 3 11 0 0 0 NA
Chv1 1 1 1 0 0 0 0 0 0 0 0 NA
Cv1 1 1 1 1 1 7 11 11 0 0 0 NA
Dc1 1 1 1 1 1 1 5 0 0 0 0 NA
Dch1 2 2 2 2 0 4 10 0 0 0 0 NA
Dcx1 2 2 2 2 2 4 10 0 0 0 0 NA
Dh1 1 1 1 1 1 5 0 0 12 0 0 NA
Dx1 1 1 1 1 1 1 9 6 7 7 0 NA
Ec1 1 1 1 1 1 1 6 10 0 0 0 NA
Ecl1 1 1 1 1 1 2 2 13 0 0 0 NA
Ecx1 2 2 2 2 0 0 12 0 9 0 0 NA
El1 1 1 1 1 1 5 9 0 0 0 0 NA
Elx1 1 1 1 1 1 1 1 1 1 1 1 NA
Ex1 1 1 1 1 1 1 4 4 5 5 0 NA
Fs1 1 1 1 1 1 1 1 6 7 7 0 NA
Fsv1 1 1 1 1 2 2 2 0 0 0 0 NA
Fv1 1 1 1 1 1 1 1 12 3 3 3 NA
Gt1 1 1 1 1 1 1 3 5 12 0 0 NA
Gtx1 1 1 1 1 1 1 1 1 1 1 1 NA
Gx1 1 1 1 1 1 0 0 0 0 0 0 NA
Hs1 1 1 1 1 1 1 1 4 5 5 0 NA
Hst1 3 3 3 4 0 0 0 0 0 0 0 NA
Ht1 1 1 1 1 1 1 5 7 8 0 0 NA
Ic1 1 1 1 1 1 3 13 0 0 0 0 NA
Icp1 1 1 1 2 2 0 0 0 0 0 0 NA
Ik1 1 1 1 1 1 1 0 0 0 0 0 NA
Iks1 1 1 1 1 2 2 0 0 0 0 0 NA
Ip1 1 1 1 3 0 0 0 0 0 0 0 NA
Ipv1 1 1 1 1 2 2 2 2 2 2 2 NA
Ipx1 1 1 1 1 2 2 2 2 2 2 2 NA
Is1 1 1 1 1 1 1 1 12 3 0 0 NA
Ist1 1 1 1 1 3 3 0 0 0 0 0 NA
It1 1 1 1 1 1 3 6 8 10 0 0 NA
Iv1 1 1 1 1 1 1 1 3 4 4 4 NA
Ix1 1 1 1 1 1 0 8 10 0 0 0 NA
Jk1 1 1 1 1 3 3 6 8 10 0 0 NA
Jkx1 1 1 1 1 1 1 1 1 9 0 0 NA
Jx1 1 1 1 1 1 1 1 3 4 4 4 NA
Kk1 1 1 1 1 1 1 1 3 0 0 0 NA
Kkl1 3 3 4 0 0 0 0 0 0 0 0 NA
Kl1 1 1 1 1 1 1 1 1 3 3 3 NA
Lk1 1 1 1 1 1 1 3 0 0 0 0 NA
Lkl1 1 1 1 2 2 6 7 9 11 0 0 NA
Ll1 1 1 1 1 1 1 5 7 8 0 0 NA
Llp1 3 3 3 4 0 0 0 0 0 0 0 NA
Lp1 1 1 1 1 1 0 4 4 0 0 0 NA
Lpv1 2 2 2 2 2 6 0 0 0 0 0 NA
Lv1 1 1 1 1 1 1 3 5 6 6 0 NA
Mc1 3 3 4 3 0 0 0 0 0 0 0 NA
Mch1 2 2 0 0 0 0 0 0 0 0 0 NA
Mcl1 1 1 1 1 1 2 2 2 0 0 0 NA
Mh1 1 1 1 1 1 0 0 0 0 0 0 NA
Mhl1 1 1 1 1 2 2 7 9 11 0 0 NA
Ml1 1 1 1 1 1 1 1 3 0 0 0 NA
Nk1 1 1 1 1 1 1 1 6 0 0 0 NA
Nkp1 1 1 1 1 2 2 0 0 0 0 0 NA
Np1 1 1 1 1 1 1 1 6 7 0 0 NA

Let’s iteratively remove the unscalable items for c ≤ 0.30:
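This iterative loop can be sketched as follows, where scale_of stands in for the aisp partitioning that assigns scale number 0 to unscalable items; the interface is a simplification for illustration, not this document's code:

```python
# Sketch of the iterative pruning loop; scale_of is a stand-in for the
# aisp partitioning, which assigns scale number 0 to unscalable items.
def drop_unscalable(items, scale_of, c=0.30):
    removed = []
    while True:
        scales = scale_of(items, c)
        bad = [it for it, s in zip(items, scales) if s == 0]
        if not bad:                       # every remaining item is scalable
            return items, removed
        removed.extend(bad)               # drop the unscalable items
        items = [it for it in items if it not in bad]

# a toy partitioner that never scales X1 and Y1:
mock = lambda items, c: [0 if it in {"X1", "Y1"} else 1 for it in items]
print(drop_unscalable(["A1", "X1", "Y1", "B1"], mock))
# → (['A1', 'B1'], ['X1', 'Y1'])
```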

After removing 21 items (Mch1, Chv1, Kkl1, Mc1, Ip1, Dch1, Ecx1, Hst1, Llp1, Dcx1, Lpv1, Gx1, Jk1, Icp1, Lkl1, Ix1, Lp1, Mh1, Ist1, It1, Iks1), we are left with the 45 items (Ak1, Akt1, At1, Bh1, Bhx1, Bs1, Bsv1, Bv1, Bx1, Ch1, Cv1, Dc1, Dh1, Dx1, Ec1, Ecl1, El1, Elx1, Ex1, Fs1, Fsv1, Fv1, Gt1, Gtx1, Hs1, Ht1, Ic1, Ik1, Ipv1, Ipx1, Is1, Iv1, Jkx1, Jx1, Kk1, Kl1, Lk1, Ll1, Lv1, Mcl1, Mhl1, Ml1, Nk1, Nkp1, Np1), covering both ‘same’ and ‘different’ items, which seem to form a single scale (more or less, especially at c = 0.30):

Table S19. MSA: aisp for increasing H thresholds (c) after retaining only 1 item per class and after removing the unscalable items.
Items c=0.05 c=0.10 c=0.15 c=0.20 c=0.25 c=0.30
Ak1 1 1 1 1 1 3
Akt1 1 1 1 1 1 2
At1 1 1 1 1 1 1
Bh1 1 1 1 1 1 1
Bhx1 1 1 1 1 1 5
Bs1 1 1 1 1 1 1
Bsv1 1 1 1 1 2 2
Bv1 1 1 1 1 1 1
Bx1 1 1 1 1 1 1
Ch1 1 1 1 1 1 1
Cv1 1 1 1 1 1 5
Dc1 1 1 1 1 1 1
Dh1 1 1 1 1 1 4
Dx1 1 1 1 1 1 1
Ec1 1 1 1 1 1 1
Ecl1 1 1 1 1 1 2
El1 1 1 1 1 1 4
Elx1 1 1 1 1 1 1
Ex1 1 1 1 1 1 1
Fs1 1 1 1 1 1 1
Fsv1 1 1 1 1 1 2
Fv1 1 1 1 1 1 1
Gt1 1 1 1 1 1 1
Gtx1 1 1 1 1 1 1
Hs1 1 1 1 1 1 1
Ht1 1 1 1 1 1 1
Ic1 1 1 1 1 1 3
Ik1 1 1 1 1 1 1
Ipv1 1 1 1 1 1 2
Ipx1 1 1 1 1 1 2
Is1 1 1 1 1 1 1
Iv1 1 1 1 1 1 1
Jkx1 1 1 1 1 1 1
Jx1 1 1 1 1 1 1
Kk1 1 1 1 1 1 1
Kl1 1 1 1 1 1 1
Lk1 1 1 1 1 1 1
Ll1 1 1 1 1 1 1
Lv1 1 1 1 1 1 1
Mcl1 1 1 1 1 1 2
Mhl1 1 1 1 1 1 2
Ml1 1 1 1 1 1 1
Nk1 1 1 1 1 1 1
Nkp1 1 1 1 1 2 2
Np1 1 1 1 1 1 1

and the subscale’s H is now a much better 0.294, (0.013), [0.269, 0.320]:

Homogeneity for the kept subscale.
Item H se 95% ci
Ak1 0.265 (0.033) [0.200, 0.330]
Akt1 0.263 (0.037) [0.191, 0.335]
At1 0.308 (0.028) [0.252, 0.363]
Bh1 0.310 (0.031) [0.250, 0.371]
Bhx1 0.235 (0.040) [0.156, 0.313]
Bs1 0.344 (0.027) [0.292, 0.397]
Bsv1 0.245 (0.033) [0.180, 0.311]
Bv1 0.306 (0.036) [0.236, 0.376]
Bx1 0.292 (0.037) [0.219, 0.364]
Ch1 0.299 (0.030) [0.240, 0.358]
Cv1 0.266 (0.030) [0.207, 0.324]
Dc1 0.292 (0.029) [0.235, 0.349]
Dh1 0.265 (0.032) [0.202, 0.329]
Dx1 0.290 (0.033) [0.225, 0.355]
Ec1 0.300 (0.031) [0.240, 0.360]
Ecl1 0.281 (0.033) [0.216, 0.345]
El1 0.268 (0.035) [0.200, 0.336]
Elx1 0.313 (0.034) [0.246, 0.380]
Ex1 0.308 (0.036) [0.237, 0.379]
Fs1 0.364 (0.027) [0.311, 0.416]
Fsv1 0.257 (0.032) [0.195, 0.320]
Fv1 0.334 (0.030) [0.275, 0.393]
Gt1 0.305 (0.032) [0.242, 0.368]
Gtx1 0.307 (0.025) [0.258, 0.355]
Hs1 0.320 (0.032) [0.257, 0.384]
Ht1 0.299 (0.030) [0.240, 0.358]
Ic1 0.268 (0.032) [0.205, 0.331]
Ik1 0.283 (0.034) [0.217, 0.348]
Ipv1 0.255 (0.047) [0.163, 0.346]
Ipx1 0.270 (0.029) [0.213, 0.326]
Is1 0.326 (0.028) [0.270, 0.381]
Iv1 0.340 (0.027) [0.287, 0.394]
Jkx1 0.283 (0.030) [0.224, 0.343]
Jx1 0.320 (0.036) [0.249, 0.391]
Kk1 0.316 (0.030) [0.257, 0.376]
Kl1 0.362 (0.037) [0.290, 0.434]
Lk1 0.306 (0.029) [0.249, 0.363]
Ll1 0.322 (0.034) [0.256, 0.388]
Lv1 0.278 (0.033) [0.213, 0.342]
Mcl1 0.301 (0.027) [0.247, 0.355]
Mhl1 0.260 (0.031) [0.200, 0.320]
Ml1 0.308 (0.034) [0.241, 0.375]
Nk1 0.305 (0.030) [0.247, 0.364]
Nkp1 0.232 (0.042) [0.149, 0.315]
Np1 0.321 (0.031) [0.261, 0.381]

22 items do not meet the local independence criterion (Ak1, Akt1, Bhx1, Bsv1, Ch1, Cv1, Dh1, Ec1, Ecl1, Elx1, Fs1, Fsv1, Gt1, Gtx1, Ipv1, Ipx1, Iv1, Jkx1, Lk1, Mcl1, Mhl1, Nkp1) and are excluded from the analysis, leaving the 23 items At1, Bh1, Bs1, Bv1, Bx1, Dc1, Dx1, El1, Ex1, Fv1, Hs1, Ht1, Ic1, Ik1, Is1, Jx1, Kk1, Kl1, Ll1, Lv1, Ml1, Nk1, Np1.

Monotonicity tests for the remaining items are shown below for default minsize:

Table S20. Monotonicity.
Item H #ac #vi #vi/#ac maxvi sum sum/#ac zmax #zsig crit
At1 0.32 3 0 0 0 0 0 0 0 0
Bh1 0.35 3 0 0 0 0 0 0 0 0
Bs1 0.37 3 0 0 0 0 0 0 0 0
Bv1 0.34 3 0 0 0 0 0 0 0 0
Bx1 0.34 3 0 0 0 0 0 0 0 0
Dc1 0.33 3 0 0 0 0 0 0 0 0
Dx1 0.33 3 0 0 0 0 0 0 0 0
El1 0.29 3 0 0 0 0 0 0 0 0
Ex1 0.33 3 0 0 0 0 0 0 0 0
Fv1 0.37 3 0 0 0 0 0 0 0 0
Hs1 0.35 3 0 0 0 0 0 0 0 0
Ht1 0.34 3 0 0 0 0 0 0 0 0
Ic1 0.29 3 0 0 0 0 0 0 0 0
Ik1 0.31 3 0 0 0 0 0 0 0 0
Is1 0.33 3 0 0 0 0 0 0 0 0
Jx1 0.35 3 0 0 0 0 0 0 0 0
Kk1 0.35 3 0 0 0 0 0 0 0 0
Kl1 0.41 3 0 0 0 0 0 0 0 0
Ll1 0.34 3 0 0 0 0 0 0 0 0
Lv1 0.31 3 0 0 0 0 0 0 0 0
Ml1 0.34 3 0 0 0 0 0 0 0 0
Nk1 0.33 3 0 0 0 0 0 0 0 0
Np1 0.35 3 0 0 0 0 0 0 0 0
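To unpack the #ac and #vi columns: manifest monotonicity requires that an item's proportion correct not decrease across groups of participants ordered by their rest score (their total on the remaining items). A minimal sketch of this check (omitting the group-merging controlled by minsize and the significance tests that the full procedure performs):

```python
from collections import defaultdict

# Illustrative manifest-monotonicity check for binary item i: the item's
# proportion correct should not decrease across groups of respondents
# ordered by their rest score (total on the other items). The real check
# also merges small groups (minsize) and tests violations for significance.
def monotonicity_check(X, i):
    groups = defaultdict(list)
    for row in X:
        groups[sum(row) - row[i]].append(row[i])   # key = rest score
    means = [sum(g) / len(g) for _, g in sorted(groups.items())]
    pairs = [(a, b) for ai, a in enumerate(means) for b in means[ai + 1:]]
    vi = [a - b for a, b in pairs if b < a]        # decreases = violations
    return {"#ac": len(pairs), "#vi": len(vi), "maxvi": max(vi, default=0.0)}

print(monotonicity_check([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]], 0))
# → {'#ac': 3, '#vi': 0, 'maxvi': 0.0}
```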

Invariant item ordering (IIO) tests are shown below for default minsize:

Table S21. Invariant item ordering (IIO).
Item H #ac #vi #vi/#ac maxvi sum sum/#ac zmax #zsig crit
At1 0.32 44 1 0.02 0.04 0.04 0.0009 0.46 0 11
Bh1 0.35 44 1 0.02 0.04 0.04 0.0009 0.46 0 9
Bs1 0.37 44 2 0.05 0.09 0.12 0.0028 1.28 0 23
Bv1 0.34 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Bx1 0.34 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Dc1 0.33 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Dx1 0.33 44 5 0.11 0.05 0.22 0.0049 0.72 0 30
El1 0.29 44 1 0.02 0.06 0.06 0.0013 0.81 0 17
Ex1 0.33 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Fv1 0.37 44 1 0.02 0.05 0.05 0.0011 0.62 0 10
Hs1 0.35 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Ht1 0.34 44 1 0.02 0.06 0.06 0.0013 0.81 0 14
Ic1 0.29 44 4 0.09 0.09 0.28 0.0064 2.11 1 55
Ik1 0.31 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Is1 0.33 44 2 0.05 0.09 0.14 0.0032 1.36 0 27
Jx1 0.35 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Kk1 0.35 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Kl1 0.41 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Ll1 0.34 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Lv1 0.31 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Ml1 0.34 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Nk1 0.33 44 0 0.00 0.00 0.00 0.0000 0.00 0 0
Np1 0.35 44 2 0.05 0.06 0.11 0.0025 2.11 1 38

and the subscale’s H is now a much better 0.337, (0.024), [0.290, 0.384]:

Homogeneity for the kept subscale.
Item H se 95% ci
At1 0.315 (0.036) [0.245, 0.386]
Bh1 0.349 (0.038) [0.275, 0.423]
Bs1 0.367 (0.034) [0.300, 0.434]
Bv1 0.340 (0.041) [0.260, 0.419]
Bx1 0.336 (0.044) [0.250, 0.423]
Dc1 0.326 (0.039) [0.250, 0.403]
Dx1 0.327 (0.039) [0.251, 0.404]
El1 0.292 (0.041) [0.212, 0.373]
Ex1 0.332 (0.042) [0.249, 0.414]
Fv1 0.366 (0.036) [0.296, 0.437]
Hs1 0.349 (0.042) [0.267, 0.431]
Ht1 0.339 (0.037) [0.267, 0.412]
Ic1 0.287 (0.039) [0.211, 0.363]
Ik1 0.309 (0.040) [0.232, 0.387]
Is1 0.333 (0.034) [0.266, 0.400]
Jx1 0.351 (0.041) [0.270, 0.432]
Kk1 0.354 (0.038) [0.280, 0.427]
Kl1 0.412 (0.046) [0.321, 0.503]
Ll1 0.341 (0.039) [0.265, 0.417]
Lv1 0.308 (0.040) [0.230, 0.386]
Ml1 0.344 (0.039) [0.267, 0.421]
Nk1 0.333 (0.037) [0.261, 0.405]
Np1 0.349 (0.037) [0.276, 0.421]

However, it can be seen that this subscale is composed entirely of ‘same’ items, which suggests that IRT/Mokken are not appropriate for analyzing such data, but they were, nevertheless, extremely useful in helping to detect the “weird” items.

2.4.6 Rewind and restart: dealing with the “weird” items

Given the above analyses, it seems that the optimal way forward is:

  1. the “weird” items (Bhx, Hst, Ist, Kkl and Llp, in all presentations and variants) require some sort of “special treatment”, prompting us to perform three independent analyses using identical methods, as follows:
    • keeping these items as they are (denoted as “original”),
    • removing these items (denoted as “removed”), and
    • recoding them as ‘same’ items (denoted as “recoded”).
  2. summarizing the participants’ performance on this task as:
    • the % of correct responses across all items and presentations (a very simple, natural and frequently used approach),
    • the % of correct responses estimated separately for the ‘same’ and the ‘different’ items (as suggested by the PCA and EFA analyses), and
    • the sensitivity and the bias estimated using Signal Detection Theory (Macmillan & Creelman, 1991).

2.4.6.1 The three datasets

These three tone datasets all have the same number of participants (492, as we did not yet remove any outliers), but different numbers of items: 208 (“original”), 188 (“removed”) and 208 (“recoded”), respectively.

As a reminder, there are 20 “weird” items in total (Bhx1, Bhx2, Bxh1, Bxh2, Hst1, Hst2, Hts1, Hts2, Ist1, Ist2, Its1, Its2, Kkl1, Kkl2, Klk1, Klk2, Llp1, Llp2, Lpl1, Lpl2).
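The construction of the three datasets can be sketched in base R; this is a toy illustration (a small random matrix stands in for the real 492 × 208 response matrix), and recoding simply “flips” the responses to the “weird” items:

```r
# The 20 "weird" items listed above:
weird_items <- c("Bhx1", "Bhx2", "Bxh1", "Bxh2", "Hst1", "Hst2", "Hts1", "Hts2",
                 "Ist1", "Ist2", "Its1", "Its2", "Kkl1", "Kkl2", "Klk1", "Klk2",
                 "Llp1", "Llp2", "Lpl1", "Lpl2")

# Toy 0/1 response matrix (participants x items) standing in for the real data:
set.seed(42)
resp <- matrix(rbinom(10 * 22, 1, 0.7), nrow = 10,
               dimnames = list(NULL, c(weird_items, "At1", "Bh1")))

d_original <- resp                                        # keep the items as they are
d_removed  <- resp[, !(colnames(resp) %in% weird_items)]  # drop the "weird" items
d_recoded  <- resp                                        # treat them as 'same' items:
d_recoded[, weird_items] <- 1 - d_recoded[, weird_items]  # "flip" their responses
```

On the real data this reproduces the item counts above: 208 (“original”), 188 (“removed”) and 208 (“recoded”).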

2.4.6.2 The three performance estimates

2.4.6.2.1 % of correct responses across all items and presentations
**Figure S42.** Histogram of the % correct on each of the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS42'/>


  pc_tot_orig pc_tot_remv pc_tot_recd
pc_tot_orig 1 0.97 0.9
pc_tot_remv 0.97 1 0.97
pc_tot_recd 0.9 0.97 1
**Figure S43.** Heatmap with clustering for the Spearman correlations between % correct total across the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS43'/>


The correlations between the total % correct across the three datasets are very large (≥0.90) suggesting that we can simply use any one of them.
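For reference, the Spearman correlation matrix behind heatmaps such as the one above can be computed along these lines; this is a base-R sketch with toy scores (the actual figures may be drawn with a different plotting function):

```r
# Toy per-participant total % correct scores for the three datasets:
set.seed(1)
pc <- data.frame(pc_tot_orig = runif(100, 50, 100))
pc$pc_tot_remv <- pc$pc_tot_orig + rnorm(100, sd = 3)
pc$pc_tot_recd <- pc$pc_tot_orig + rnorm(100, sd = 5)

cm <- cor(pc, method = "spearman")  # Spearman rank correlation matrix
round(cm, 2)

# a clustered heatmap can then be drawn with, e.g., base R's heatmap():
# heatmap(cm, symm = TRUE, margins = c(10, 10))
```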

2.4.6.2.2 % of correct responses for ‘same’ and ‘different’ items
**Figure S44.** Histogram of the % correct 'same' (top) and 'different' (bottom) on each of the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS44'/>


  pc_same_orig pc_diff_orig pc_same_remv pc_diff_remv pc_same_recd pc_diff_recd
pc_same_orig 1 0.23 1 0.34 0.94 0.34
pc_diff_orig 0.23 1 0.23 0.97 0.08 0.97
pc_same_remv 1 0.23 1 0.34 0.94 0.34
pc_diff_remv 0.34 0.97 0.34 1 0.24 1
pc_same_recd 0.94 0.08 0.94 0.24 1 0.24
pc_diff_recd 0.34 0.97 0.34 1 0.24 1
**Figure S45.** Heatmap with clustering for the Spearman correlations between % correct for 'same' and 'different' items across the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS45'/>


It is a similar story here, with the % correct for the ‘same’ items being highly intercorrelated (≥0.94) as are those for the ‘different’ items (≥0.97), but the correlations between these two types are low (between 0.09 and 0.35) with the lowest for the ‘original’ dataset; therefore we will use the results for the ‘remove’ (or the ‘recode’) dataset.

2.4.6.2.3 Signal Detection Theory (sensitivity and bias)

We estimate the following measures (using psycho::dprime(), see Pallier (2002), https://bookdown.org/danbarch/psy_207_advanced_stats_I/signal-detection-theory.html and https://www.birmingham.ac.uk/Documents/college-les/psych/vision-laboratory/sdtintro.pdf for details):

  • d’ and beta (or c): the “classic” measures of sensitivity (d’) and bias (beta):
    • d’ is theoretically positive and varies from 0.0 (no sensitivity) to +∞ (in practice, up to ~3), with higher values representing higher sensitivity (please note that it is possible to obtain small negative estimates when there is no capacity to detect the signal, but these should be treated as essentially 0.0),
    • beta varies from 0.0 (bias towards ‘different’) to +∞ (bias towards ‘same’), where 1.0 represents an unbiased observer,
    • c is an alternative estimator of the bias, representing the number of standard deviations from the midpoint between the ‘noise’ and the ‘signal + noise’ distributions; it is a symmetric continuum going from negative values (preferring ‘different’ responses), through 0 (“unbiased”), to positive values (preferring ‘same’ responses);
  • A’ and B’’D: the non-parametric counterparts of d’ and beta:
    • A’ near 0.5 represents chance performance, while values closer to 1.0 represent good detection,
    • B’’D varies between -1 (bias towards ‘different’) and 1 (bias towards ‘same’), with 0 being an unbiased observer.
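The analyses themselves use psycho::dprime(), but the standard formulas behind these measures can be sketched in base R (psycho additionally corrects hit/false-alarm rates of exactly 0 or 1; here H is the hit rate and FA the false-alarm rate, their mapping onto the ‘same’/‘different’ responses following the actual analysis script):

```r
# "Classic" and non-parametric SDT measures from a hit rate H and a
# false-alarm rate FA (no correction for extreme rates in this sketch):
sdt_measures <- function(H, FA) {
  zH <- qnorm(H); zFA <- qnorm(FA)
  dprime <- zH - zFA                       # sensitivity d'
  c_bias <- -(zH + zFA) / 2                # bias c (0 = unbiased)
  beta   <- exp(dprime * c_bias)           # bias beta (1 = unbiased)
  aprime <- 0.5 + sign(H - FA) * ((H - FA)^2 + abs(H - FA)) /
            (4 * pmax(H, FA) - 4 * H * FA) # non-parametric sensitivity A'
  bppd   <- ((1 - H) * (1 - FA) - H * FA) /
            ((1 - H) * (1 - FA) + H * FA)  # non-parametric bias B''D
  list(dprime = dprime, beta = beta, c = c_bias, aprime = aprime, bppd = bppd)
}

sdt_measures(H = 0.8, FA = 0.2)  # symmetric performance: c = 0, beta = 1
```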
**Figure S46.** Histogram of the % correct 'same' (top) and 'different' (bottom) on each of the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS46'/>


Table continues below
  dprime_orig beta_orig c_orig aprime_orig bppd_orig
dprime_orig 1 0.8 0.55 0.92 0.8
beta_orig 0.8 1 0.9 0.54 1
c_orig 0.55 0.9 1 0.29 0.91
aprime_orig 0.92 0.54 0.29 1 0.55
bppd_orig 0.8 1 0.91 0.55 1
dprime_remv 0.98 0.71 0.46 0.95 0.71
beta_remv 0.51 0.89 0.91 0.21 0.89
c_remv 0.28 0.74 0.92 -0.01 0.75
aprime_remv 0.91 0.55 0.3 0.98 0.56
bppd_remv 0.53 0.89 0.92 0.23 0.9
dprime_recd 0.93 0.69 0.46 0.9 0.69
beta_recd 0.26 0.74 0.84 -0.06 0.73
c_recd 0.07 0.59 0.82 -0.23 0.6
aprime_recd 0.9 0.59 0.35 0.95 0.6
bppd_recd 0.25 0.74 0.85 -0.07 0.73
Table continues below
  dprime_remv beta_remv c_remv aprime_remv bppd_remv
dprime_orig 0.98 0.51 0.28 0.91 0.53
beta_orig 0.71 0.89 0.74 0.55 0.89
c_orig 0.46 0.91 0.92 0.3 0.92
aprime_orig 0.95 0.21 -0.01 0.98 0.23
bppd_orig 0.71 0.89 0.75 0.56 0.9
dprime_remv 1 0.38 0.14 0.96 0.39
beta_remv 0.38 1 0.92 0.21 0.99
c_remv 0.14 0.92 1 -0.02 0.92
aprime_remv 0.96 0.21 -0.02 1 0.23
bppd_remv 0.39 0.99 0.92 0.23 1
dprime_recd 0.97 0.36 0.13 0.94 0.37
beta_recd 0.14 0.92 0.91 -0.03 0.9
c_recd -0.05 0.81 0.95 -0.21 0.81
aprime_recd 0.96 0.25 0.03 0.99 0.27
bppd_recd 0.13 0.91 0.92 -0.04 0.9
  dprime_recd beta_recd c_recd aprime_recd bppd_recd
dprime_orig 0.93 0.26 0.07 0.9 0.25
beta_orig 0.69 0.74 0.59 0.59 0.74
c_orig 0.46 0.84 0.82 0.35 0.85
aprime_orig 0.9 -0.06 -0.23 0.95 -0.07
bppd_orig 0.69 0.73 0.6 0.6 0.73
dprime_remv 0.97 0.14 -0.05 0.96 0.13
beta_remv 0.36 0.92 0.81 0.25 0.91
c_remv 0.13 0.91 0.95 0.03 0.92
aprime_remv 0.94 -0.03 -0.21 0.99 -0.04
bppd_remv 0.37 0.9 0.81 0.27 0.9
dprime_recd 1 0.18 -0.01 0.97 0.17
beta_recd 0.18 1 0.93 0.05 1
c_recd -0.01 0.93 1 -0.14 0.94
aprime_recd 0.97 0.05 -0.14 1 0.04
bppd_recd 0.17 1 0.94 0.04 1
**Figure S47.** Heatmap with clustering for the Spearman correlations between SDT sensitivity and bias measures across the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS47'/>


First, the signal estimates d’ and A’ are very highly intercorrelated across datasets (≥0.93 for d’ and ≥0.95 for A’) and between them (between 0.91 and 0.96). Second, within each dataset, beta and c have high correlations of ~0.9. Between datasets, beta for ‘remove’ seems to have the best intercorrelations (0.92 with ‘recode’ and 0.89 with ‘original’) while the correlation between ‘original’ and ‘recode’ is 0.75; however, c shows much higher intercorrelations (between 0.82 and 0.95). B’’D for ‘original’ is moderately correlated with the other two (0.73 with ‘recode’ and 0.90 with ‘remove’), while these two correlate at 0.90. Within datasets, B’’D correlates very strongly with c (around 0.92) and almost perfectly with beta (≥0.99).

Coupled with the more natural interpretation of some of these estimates, we will focus primarily on d’ and c, but also keep A’ and B’’D in mind.

2.4.6.2.4 Combining the estimates

From the above, it seems that either the ‘remove’ or the ‘recode’ datasets would be best if we had to use a single dataset, and there seems to be a slight data-driven advantage for the former (in the sense that it has the best correlations with the other two datasets), but at the cost of completely losing the information provided by the “weird” items – given these, we will continue using the three datasets for now. In terms of actual estimates, we will keep the % correct responses overall and for the ‘same’ and ‘different’ items separately, while for the Signal Detection Theory we will focus on d’ and c.

However, first we will detect those participants with very high biases (one way or another):

**Figure S48.** Histogram of the bias c on each of the three datasets (repeated from above). Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS48'/>


It can be seen that in all three datasets there is an overall bias towards answering ‘same’ (c > 0, highly significant in all three cases), but much smaller (and statistically highly significantly so) for the ‘remove’ (0.3) and especially ‘recode’ (0.2) than for the ‘original’ (0.52) dataset.

**Figure S49.** Histogram of the bias c on each of the three datasets. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS49'/>


It is clear that there are some participants with strong biases in all three datasets; let’s identify them. First, there is one participant who systematically answered “same” irrespective of the item:

Table S22. One participant systematically answered ‘same’
age gender music_years education_years location
51 F 0 2 B

Second, there are no other participants with extreme biases (≤-2 or ≥2) in any dataset, and only a few with c outside [-1, 1], so let’s remove only this one participant.

Interestingly, while for the “original” dataset there are quite a few participants with biases |c|≥1 (2 with c≤-1 and 34 with c≥1), there are fewer for the “remove” (3 with c≤-1 and 14 with c≥1) and for the “recode” (3 with c≤-1 and 11 with c≥1) datasets, again supporting the idea that the “weird” items are treated as ‘same’ and not ‘different’.

Focusing on the “de-biased” signal sensitivity d’, it is interesting to note that it is higher, on average, on the “removed” (2.49) dataset than on the “recoded” (2.3) dataset, and both are higher than on the “original” (2.04) dataset (all differences are highly significant as judged with two-sample t-tests), again supporting the view that the “weird” items are ‘same’ and not ‘different’.

Let’s look at the relationships between estimates within datasets:

**Figure S50.** Heatmap with clustering for the Spearman correlations between various measures in each dataset. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS50'/>


**Figure S51.** The three main measures of success on each dataset. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS51'/>


d’ and A’ are highly correlated (≥0.92), as expected, and each is highly correlated with the % total correct responses: d’ between 0.87 and 0.96, while A’ is virtually perfectly correlated with it (≥0.97).

2.4.6.2.5 Consider the “weird” items separately

Keeping, removing or recoding the “weird” items all share the downside of ignoring the participants who seemingly answered these items correctly. Therefore, we will also analyze these items separately, keeping them coded as ‘different’ items together with their associated ‘same’ items.

**Figure S52.** Histograms of the measures on the 'weird' dataset. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS52'/>


As expected (see the Table below), the overall % correct responses is low due to the very low % correct responses on the ‘different’ items, which is reflected in rather low sensitivities and strong biases towards ‘same’ responses compared with the other three datasets:

Table S23. Comparing the means of the measures across all four datasets.
dataset % total % same % different d’ beta c A’ B’’D
original 78.6 89.0 68.1 2.04 5.69 0.52 0.85 0.56
remove 84.5 89.0 79.0 2.49 3.71 0.29 0.88 0.37
recode 83.8 87.1 79.0 2.30 2.54 0.20 0.88 0.22
weird 60.7 90.0 22.6 0.71 3.09 1.16 0.69 0.60

Importantly, the maximum % correct is 94.2% overall, 100% for ‘same’ and 97.1% for ‘different’, suggesting that some participants seem to have answered correctly. Let’s see who these participants are and what their responses on the ‘non-weird’ items look like:

**Figure S53.** The measures of success for the 'weird' and 'non-weird' ('remove') items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS53'/>


And let’s look at their correlations:

**Figure S54.** Heatmap with clustering for the correlations between all measures for the 'weird' and 'non-weird' ('remove' dataset) items. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS54'/>


The important relationships are those between the two sets (“weird” vs “non-weird” items) for:

% total correct responses: their correlation is 0.50 overall, but this is likely heavily biased by the ‘same’ items (which, as expected, behave the same for the “weird” and “non-weird” items),

% correct for the ‘different’ items: while overall there is no correlation (-0.03, p>0.05), there seem to be two or three groups of participants (cluster::clusGap() suggests 3 clusters, while NbClust::NbClust() finds that 8 methods suggest 2 clusters and 7 suggest 3):
**Figure S55.** K-means clustering of the participants using the % correct for 'different' on the 'weird' and 'non-weird' items, with k=2 (left) and k=3 (right), trying to keep the colors and symbols of the corresponding clusters similar. Please note the order of the 'numbers' (i.e., '1', '2' and '3') in the legend is arbitrary. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS55'/>


It can be seen that the 3-clusters solution is more or less splitting one of the clusters of the 2-clusters solution, with the rough correspondences between the k=2 and k=3 clusters 1(2) ≈ 1(3) [red in the figure] and 2(2) ≈ 2(3)+3(3) [shades of blue in the figure], where the first digit is the cluster ‘number’ (which is arbitrary) and in parentheses the number of clusters, k, so 1(2) means cluster ‘1’ of the k=2 solution. Within each cluster, we have the following correlations between the “weird” and “non-weird” % correct:
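The clustering step itself can be sketched with base R’s kmeans() (toy two-group data stand in for the real (‘non-weird’, ‘weird’) % correct pairs; the number-of-clusters step via cluster::clusGap() is shown commented out as it is computationally heavier):

```r
set.seed(123)
# toy stand-in: two well-separated groups of participants
x <- rbind(matrix(rnorm(100, mean = 40, sd = 8), ncol = 2),
           matrix(rnorm(100, mean = 85, sd = 8), ncol = 2))
colnames(x) <- c("pc_diff_nonweird", "pc_diff_weird")

km2 <- kmeans(x, centers = 2, nstart = 25)  # the k = 2 solution
km3 <- kmeans(x, centers = 3, nstart = 25)  # the k = 3 solution
km2$centers                                 # cluster centroids

# estimating the number of clusters (uses the recommended 'cluster' package):
# gs <- cluster::clusGap(x, FUNcluster = kmeans, K.max = 8, B = 100, nstart = 25)
```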

Table S24. Correlations (Pearson and Spearman) between % correct ‘different’ for the ‘weird’ and ‘non-weird’ items on the full dataset (1 cluster) and for each cluster separately (the same cluster notations as in the figure above).
# clusters cluster 1 cluster 2 cluster 3
full dataset r=-0.18, p=9.34×10⁻⁵; ρ=-0.03, p=0.557 - -
k=2 r=0.45, p=3.92×10⁻⁶; ρ=0.37, p=1.94×10⁻⁴ r=0.36, p=1.66×10⁻¹³; ρ=0.36, p=3.22×10⁻¹³ -
k=3 r=0.49, p=1.29×10⁻⁶; ρ=0.42, p=5.73×10⁻⁵ r=0.09, p=0.281; ρ=0.11, p=0.193 r=0.21, p=4.63×10⁻⁴; ρ=0.21, p=6.84×10⁻⁴

It can be seen that, while on the full dataset there basically is no correlation (Pearson’s seems affected by a few outliers), in clusters 1(2), 1(3), 2(2) and 3(3) there is a strong positive and highly significant correlation, while on cluster 2(3) there is no correlation.

For k=2, cluster 1(2) comprises only 95 participants (19.4%) who seem to respond at chance level to the “non-weird” ‘different’ items (between 7.1% and 81%, mean = 47.5% and median = 48.8%) and have low % correct for the “weird” items (between 0% and 70%, mean = 37% and median = 40%). Here, there is a significant positive relationship between the two % correct measures, the % correct on the “non-weird” items predicting very well that on the “weird” items (linear regression β=0.63±0.13, p=3.92×10⁻⁶).

The same pattern holds for the corresponding k=3 cluster 1(3): it comprises 86 participants (17.6%) who seem to respond at chance level to the “non-weird” ‘different’ items (between 7.1% and 65.5%, mean = 45.6% and median = 47%) and have low % correct for the “weird” items (between 0% and 70%, mean = 37.2% and median = 40%). Here, there is a significant positive relationship between the two % correct measures, the % correct on the “non-weird” items predicting very well that on the “weird” items (linear regression β=0.73±0.14, p=1.29×10⁻⁶).

For k=2, cluster 2(2) comprises the vast majority of the participants (394, or 80.6%), who have high % correct responses for the “non-weird” ‘different’ items (≥ 59.5%, mean = 86.6% and median = 88.1%) but mostly low % correct for the “weird” items (≤ 90%, mean = 19.1% and median = 15%). Here, there is a significant positive relationship between the two % correct measures, the % correct on the “non-weird” items predicting that on the “weird” items (linear regression β=0.60±0.08, p=1.66×10⁻¹³).

This group roughly splits in two for k=3: cluster 2(3), comprising 86 participants (17.6%) with relatively high % correct for the “non-weird” items (≥ 7.1%, mean = 45.6% and median = 47%) and very low % correct for the “weird” items (≤ 70%, mean = 37.2% and median = 40%), basically those who clearly treated the “weird” items as being of the ‘same’ type; here there is no relationship between the two % correct measures (β=0.15±0.14, p=0.281). The other part forms cluster 3(3), comprising 265 participants (54.2%) with high % correct for the “non-weird” items (≥ 51.2%, mean = 84.3% and median = 85.7%) and low % correct for the “weird” items (≤ 25%, mean = 11% and median = 10%); here there is a weaker but still significant positive prediction of the % correct on the “weird” items from that on the “non-weird” ones (β=0.15±0.04, p=4.63×10⁻⁴).

Importantly, the % correct for the “weird” items (which are nominally ‘different’) has very strong and highly significant negative correlations with the % correct for the ‘same’ “non-weird” items, and the corresponding linear regressions have negative slopes β of comparable size (-0.90 ≤ β ≤ -0.50) on the whole dataset and in each cluster separately:

Table S25. Correlations (Pearson and Spearman) between % correct for the ‘weird’ and for the ‘same’ for the ‘non-weird’ items on the full dataset (1 cluster) and for each cluster separately (the same cluster notations as in the figure above).
# clusters cluster 1 cluster 2 cluster 3
full dataset r=-0.69, p=6.14×10⁻⁷⁰; ρ=-0.64, p=2.78×10⁻⁵⁸ - -
k=2 r=-0.77, p=6.53×10⁻²⁰; ρ=-0.61, p=3.42×10⁻¹¹ r=-0.54, p=6.42×10⁻³¹; ρ=-0.51, p=4.68×10⁻²⁷ -
k=3 r=-0.79, p=3.1×10⁻¹⁹; ρ=-0.61, p=6.67×10⁻¹⁰ r=-0.57, p=2.64×10⁻¹³; ρ=-0.28, p=7.08×10⁻⁴ r=-0.30, p=9.95×10⁻⁷; ρ=-0.29, p=1.77×10⁻⁶
**Figure S56.** The % correct responses for the 'weird' (y axis) vs the % correct responses for the *'same'* 'non-weird' (aka 'remove') items (x axis) overall (left), for k=2 (middle) and k=3 (right), keeping the same cluster colors and symbols as above. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS56'/>


This is precisely the pattern to be expected if the “weird” items behave like the ‘same’ items: the participants with higher performance on the ‘same’ items have lower performance on the “weird” items because their responses are in fact “flipped”.

2.4.6.3 Conclusions

It turns out that the tone task is far from simple; in particular, the 5 items Bhx, Hst, Ist, Kkl and Llp (the so-called “weird” items) clearly behave like the ‘same’ items and not like the other ‘different’ items. While this is an interesting question from a phonological/phonetic point of view, we must leave it for future research.

Here, the relevant question is “what to do with these items?”. There are three basic options: (1) leave them as they are (the “original” dataset), (2) simply drop them (the “removed” dataset), or (3) consider them as ‘same’ items and “flip” their responses (the “recode” dataset). From the analyses performed above and theoretical considerations, we will focus on the “recode” dataset but also keep the original dataset for reference.

With these we have the following primary and secondary measures:

  • primary measures: these are estimates of the performance at discriminating the tones: % correct responses overall and d’ on the “original” and the “recode” datasets, and
  • secondary measures:
    • the % correct responses overall and d’ on the “remove” dataset, and
    • % correct responses for ‘same’ and ‘different’ items separately, and the measure of bias c on all three datasets.

Finally, please note that while d’ can arguably be modeled well using linear regression, for the % correct responses we have several potential choices:

  1. linear regression on the % of correct responses: this is arguably wrong as it allows values outside the [0,100] range; we will not pursue it here except for complex path models where beta regression (see below) does not currently work and where the models fitted using the two approaches are very similar.
  2. beta regression on the proportion of correct responses (i.e., on the %/100.0): this is a technique implemented in R by, for example, glmmTMB(..., family=beta_family()), standardly used to model proportion data, but it has two drawbacks: it cannot deal with proportions of exactly 0.0 and 1.0, requiring these to be converted to values just above 0.0 and just below 1.0, respectively, and it produces estimates that are relatively hard to interpret for most practitioners.
  3. logistic regression on the actual responses (or, equivalently, on the counts of “0” and “1” responses): this is probably the best way to model our data, as logistic regression is very well supported in R and most practitioners know how to interpret its results.

Therefore, we will systematically perform logistic regression on the counts of “0” (“incorrect”) and “1” (“correct”) responses, but we might still plot and show the % of correct responses when appropriate.
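A minimal sketch of such a counts-based logistic regression (with toy data; on the real data the predictors would be the covariates of interest, e.g., age):

```r
# Each participant contributes a (n_correct, n_incorrect) pair of counts:
set.seed(7)
n   <- 200
dat <- data.frame(age = runif(n, 18, 80))
p_true <- plogis(2 - 0.02 * dat$age)                     # true per-item P("correct")
dat$n_correct   <- rbinom(n, size = 188, prob = p_true)  # e.g., 188 items ("removed")
dat$n_incorrect <- 188 - dat$n_correct

# binomial GLM on the counts of "1" (correct) and "0" (incorrect) responses:
m <- glm(cbind(n_correct, n_incorrect) ~ age, data = dat, family = binomial)
coef(summary(m))  # estimates are on the log-odds scale
```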

3 Results

3.1 Correlations between measures

Pearson’s:

**Figure S57.** Heatmap with clustering for the Pearson's correlations between the continuous measures of interest. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS57'/>


Spearman’s:

**Figure S58.** Heatmap with clustering for the Spearman's correlations between the continuous measures of interest. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS58'/>


It can be seen that:

  • the results using either correlation estimate are very similar,
  • age is negatively correlated (significantly or not) with all the other variables, especially strongly with years of education and the working memory task, and much less with the various measures of tone task performance,
  • gender is correlated with years of education (higher for males),
  • the location correlates with years of education (slightly higher in A) and with the performance on the tone task (especially the ‘different’ items, better for A),
  • % correct answers and d’ on both datasets behave very similarly: negatively affected by age, affected by location (better in A), years of education (better with more education) and gender (slightly better for males), and positively affected by working memory,
  • % correct answers for the ‘different’ items on both datasets are very similar and behave like % total and d’ above,
  • % correct answers for the ‘same’ items on both datasets are very similar and behave like % total and d’ above,
  • finally, the bias estimators c behave strikingly differently on the two datasets: while on the ‘original’ it is negatively affected by age and positively by education and working memory, on the ‘recoded’ it does not seem influenced by any such covariate.

These bi-variate correlations, however, might hide more complex relationships between the measures of interest and the covariates, which can be tested using multiple regression, mediation and (piecewise) path and structural models.

3.2 The working memory task

We use here the normalized working memory performance estimate (wm_norm).

3.2.1 Location and family

There are no significant differences between the two main locations (A and B) in terms of the working memory task performance (linear regression βB-A=-0.0075, p=0.688), and, for those 91 participants with information about family relationships, the generation they belong to also has no effect (linear regression βold-young=-0.081, p=0.122), and, moreover, there is no clustering within families (the linear model with family as a random effect has an ICC of 2.4%), suggesting that we need not model these factors here.
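The family-clustering check can be sketched as follows; the random-intercept model itself would be fitted with something like lme4::lmer(wm_norm ~ 1 + (1 | family)), but for a balanced toy example the ICC can also be computed directly from the one-way ANOVA mean squares:

```r
set.seed(99)
k   <- 5                                            # members per family (balanced toy case)
fam <- gl(20, k)                                    # 20 toy families
y   <- rnorm(20, sd = 1)[fam] + rnorm(100, sd = 1)  # family effect + individual noise

ms  <- anova(lm(y ~ fam))[["Mean Sq"]]              # between- and within-family mean squares
icc <- (ms[1] - ms[2]) / (ms[1] + (k - 1) * ms[2])
icc  # the true value here is 1 / (1 + 1) = 0.5
```

An ICC as small as the observed 2.4% indicates that almost none of the variance in working memory performance lies between families.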

3.2.2 Multiple regression

A multiple linear regression of the working memory task performance on age, gender and years of education and all their interactions simplifies (via manual backward simplification based on F-test p-values) to a model with main effects only:


Call:
lm(formula = wm_norm ~ age + gender + education_years, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.45016 -0.09722 -0.00813  0.09104  0.50216 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)      0.5539387  0.0365383  15.160   <2e-16 ***
age             -0.0058270  0.0006713  -8.680   <2e-16 ***
genderM         -0.0320631  0.0142637  -2.248    0.025 *  
education_years  0.0206672  0.0020358  10.152   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1425 on 485 degrees of freedom
Multiple R-squared:  0.509, Adjusted R-squared:  0.506 
F-statistic: 167.6 on 3 and 485 DF,  p-value: < 2.2e-16
**Figure S59.** Untransformed slopes with standard errors. Figure generated using [`R`](https://www.r-project.org/) version 4.3.3 (2024-02-29)<a id='FigS59'/>


suggesting that age has a negative effect, years of education a positive effect, and that males perform worse than females. However, it is possible that the causal model is more complex, with the effects of gender and age largely mediated by the years of education.

3.2.3 Mediation: gender → education → working memory

Indeed, fitting a mediation model where gender influences working memory through years of education (N.B. while the outcome model is a linear regression of working memory performance on the mediator and the treatment, the mediator model is a Poisson regression of years of education on the treatment, because years of education is a count variable) finds a highly significant positive indirect effect (ACME), but also a significant direct effect (ADE); as fitted using mediation::mediate():


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            0.07474      0.04916         0.10  <2e-16 ***
ADE            -0.04001     -0.07098        -0.01   0.015 *  
Total Effect    0.03473     -0.00461         0.07   0.084 .  
Prop. Mediated  2.05904     -7.70715        13.87   0.084 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 489 


Simulations: 10000 

and as fitted using piecewiseSEM::psem():


Structural Equation Model of wm_task_results$med_gender__education$piecewise$model 

Call:
  education_years ~ gender_n
  wm_norm ~ gender_n + education_years

    AIC
 2833.359

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years        gender_n   0.3653    0.0373 487     9.7828  0.0000
          wm_norm        gender_n  -0.0403    0.0153 486    -2.6356  0.0087
          wm_norm education_years   0.0320    0.0017 486    19.1132  0.0000
  Std.Estimate    
        0.2563 ***
       -0.0932  **
        0.6755 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.18
          wm_norm       none      0.43

Figure S60. Mediation model of gender, years of education and working memory performance showing the standardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

Moreover, testing this partial mediation model against the full mediation model (that does not include a direct effect but only the indirect effect) using d-separation and model comparison finds that while the effect of gender is mostly mediated through years of education (males have more, and years of education are positively related to working memory), the direct effect of gender on working memory also matters (d-sep p=0.009, model comparison χ²(1)=6.9, p=0.008).
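The decomposition reported by these mediation fits can be illustrated with a deliberately simplified, all-linear sketch (the actual mediator model for years of education is a Poisson regression, and mediation::mediate() estimates the effects via quasi-Bayesian simulation; for OLS with no treatment × mediator interaction, total = direct + indirect holds exactly):

```r
set.seed(2024)
n <- 489
gender_n  <- rbinom(n, 1, 0.5)                      # toy "treatment"
education <- 10 + 2 * gender_n + rnorm(n)           # toy mediator
wm        <- 0.1 - 0.04 * gender_n + 0.03 * education + rnorm(n, sd = 0.15)

a     <- coef(lm(education ~ gender_n))["gender_n"]  # treatment -> mediator
fit_y <- lm(wm ~ gender_n + education)
b     <- coef(fit_y)["education"]                    # mediator -> outcome
de    <- coef(fit_y)["gender_n"]                     # direct effect (cf. ADE)
te    <- coef(lm(wm ~ gender_n))["gender_n"]         # total effect

c(indirect = unname(a * b), direct = unname(de), total = unname(te))
```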

3.2.4 Mediation: age → education → working memory

Likewise, fitting a mediation model where age influences working memory through years of education finds a highly significant negative direct effect (ADE), but also a significant negative mediated effect (ACME); as fitted using mediation::mediate():


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME           -0.01300     -0.02510         0.00   0.024 *  
ADE            -0.00594     -0.00754         0.00  <2e-16 ***
Total Effect   -0.01894     -0.03072        -0.01   0.001 ***
Prop. Mediated  0.68258      0.22255         0.83   0.023 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 489 


Simulations: 10000 

and as fitted using piecewiseSEM::psem():


Structural Equation Model of wm_task_results$med_age__education$piecewise$model 

Call:
  education_years ~ age
  wm_norm ~ age + education_years

    AIC
 2289.804

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0326    0.0013 487   -24.2750       0
          wm_norm             age  -0.0059    0.0007 486    -8.8119       0
          wm_norm education_years   0.0196    0.0020 486     9.8602       0
  Std.Estimate    
       -0.6394 ***
       -0.3691 ***
        0.4130 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke       0.7
          wm_norm       none       0.5

Figure S61. Mediation model of age, years of education and working memory performance showing the standardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

Moreover, the effect of age is split between the direct negative effect and the effect mediated through its negative influence on years of education (d-sep p=2.19×10⁻¹⁷, model comparison χ2(1)=72.5, p<0.001).

3.2.5 Conclusions

Working memory performance is not influenced by location (as expected) and does not cluster within families. It is strongly positively correlated with years of education (the direction of causality is unclear and could be bidirectional), and is strongly negatively influenced by age, both directly and indirectly through age's negative effect on years of education. Gender acts mainly indirectly, as males have more years of education, but also directly, as males seem to perform slightly worse than females.

3.3 The tone task

Given that the tone task results can potentially be analysed in several ways, we conducted a preliminary comparison to decide on the best approach:

  (a) “flat” logistic regression (i.e., the participants and the items are “pooled” together): this has the advantage of being a very simple and well-understood regression model that works on the actual response of a given individual to a given item, thus allowing, in principle, the modelling of item characteristics (e.g., ‘same’ vs ‘different’, ‘weird’ vs ‘non-weird’, or whether it contains an actual word), but it suffers from the drawback shared by all models that ignore the hierarchical structure of the data, namely that their estimates of significance may be extremely biased;
  (b) mixed-effects logistic regression including the participant and item as crossed random effects (aka ... + (1 | participant) + (1 | item)), which keeps the advantages of (a) but properly deals with its main shortcoming; and
  (c) beta regression on the % correct responses: this collapses the responses across items for a given participant, addressing the issue of participant (but not item) clustering.

We compared these approaches on our data and found that:

  • all of them produced comparable regression coefficient estimates, but
  • method (a) is inappropriate, as it results in overly optimistic p-values;
  • the (adjusted) ICC of the participants and items is around 40%, confirming that we need to somehow address this clustering (and confirming the inadequacy of (a));
  • methods (b) and (c) agree not only in their point estimates but also in their standard errors and p-values;
  • method (b) is slower and may have convergence issues for complex fixed-effects structures;
  • by using a single estimate per participant, (c) is similar both to the approach in Wong et al. (2020) and to the use of d’;
  • likewise, (c) is easier to introduce into mediation models using other variables of interest that have a single value per participant (e.g., gender, working memory, years of education…).

Therefore, we perform beta regression on the % correct responses, using mixed-effects logistic regression as a sanity check in some cases.
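The three candidate approaches can be sketched as follows (hypothetical data frames `resp`, with one row per participant × item, and `d`, with one row per participant; all column names are assumptions, not the actual variable names used in the script):

```r
library(lme4)     # for glmer()
library(glmmTMB)  # for beta regression

# (a) "flat" logistic regression, pooling participants and items
m_flat <- glm(correct ~ age + gender, family = binomial, data = resp)

# (b) mixed-effects logistic regression with crossed random effects
m_mixed <- glmer(correct ~ age + gender + (1 | participant) + (1 | item),
                 family = binomial, data = resp)

# (c) beta regression on the per-participant % correct responses
m_beta <- glmmTMB(pcr ~ age + gender, family = beta_family(), data = d)
```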

3.3.1 % correct total (recode)

We focus here on the % total correct responses estimated on the ‘recoded’ dataset (pcr = percent correct recoded). Given that these are percentages bounded, by definition, between 0% and 100%, we used beta regression (as implemented by glmmTMB::glmmTMB()); however, the mediation modelling function mediation::mediate() has trouble with beta regression, as does piecewiseSEM::psem() when it comes to fitting the full mediation model, so in these cases we employed the equivalent linear regressions (the relevant coefficient estimates and p-values are similar enough between the two approaches that the qualitative conclusions hold). Moreover, piecewiseSEM has trouble estimating the standardized path coefficients, so we report the unstandardized ones here.
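One practical point worth noting: beta regression requires responses strictly inside the open interval (0, 1), so any participant scoring exactly 0% or 100% would need handling first; a common fix is the Smithson & Verkuilen (2006) “squeeze” transform. A sketch under assumed column names (whether the actual analysis needed this step is not shown here):

```r
library(glmmTMB)

# Squeeze pcr from [0, 1] into the open interval (0, 1)
n <- nrow(d)
d$pcr_sq <- (d$pcr * (n - 1) + 0.5) / n

m <- glmmTMB(pcr_sq ~ age + gender + education_years + location_ab + wm_norm,
             family = beta_family(), data = d)
```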

3.3.1.1 Location and family

There is a highly significant difference between the two main locations (A has higher overall performance than B; beta regression βB-A=-0.45, p=1.45×10⁻⁹). For those 91 participants with information about family relationships, the generation they belong to has no effect (beta regression βold-young=-0.019, p=0.931). However, there is slight clustering within families (the linear model with family as a random effect has an ICC of 9.9%, and including vs excluding family as a fixed effect in a “flat” beta regression gives p=0.016), but given the loss of sample size and the relatively small amount of variance explained, we ignore it here.
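The family-clustering check can be sketched as follows (the column name `family` is an assumption; performance::icc() is one way of obtaining the ICC, not necessarily the one used here):

```r
library(lme4)
library(performance)

# Intercept-only model with a random effect for family
m_fam <- lmer(pcr ~ 1 + (1 | family), data = d)
performance::icc(m_fam)  # proportion of variance attributable to families
```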

3.3.1.2 Multiple regression

A beta multiple regression of pcr on working memory, age, gender, years of education and location and all their interactions simplifies (using manual simplification based on the F-test’s p-value) to a model with main effects and only two 2-way interactions:

 Family: beta  ( logit )
Formula:          
pcr ~ age + gender + education_years + location_ab + wm_norm +  
    age:gender + age:education_years
Data: d

     AIC      BIC   logLik deviance df.resid 
  -940.6   -903.3    479.3   -958.6      454 


Dispersion parameter for beta family (): 11.5 

Conditional model:
                      Estimate Std. Error z value Pr(>|z|)    
(Intercept)          1.6384067  0.4358957   3.759 0.000171 ***
age                 -0.0147449  0.0084924  -1.736 0.082518 .  
genderM             -0.6487713  0.2521834  -2.573 0.010093 *  
education_years     -0.0640475  0.0389790  -1.643 0.100356    
location_abB        -0.3905770  0.0691881  -5.645 1.65e-08 ***
wm_norm              1.4110569  0.2426382   5.815 6.05e-09 ***
age:genderM          0.0171747  0.0062264   2.758 0.005809 ** 
age:education_years  0.0030296  0.0008416   3.600 0.000318 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure S62. Untransformed slopes with standard errors. Figure generated using R version 4.3.3 (2024-02-29)


Figure S63. Predicted values of pcr showing the interaction of gender and age. Figure generated using R version 4.3.3 (2024-02-29)


Figure S64. Predicted values of pcr showing the interaction of education_years and age. Figure generated using R version 4.3.3 (2024-02-29)

suggesting that age and years of education have no main effects, that working memory has a positive effect, that males have better performance than females, with interactions between age and education and between age and gender, and that participants from A have better performance than those from B. As above, we test more complex models of causality using mediation and path analysis.
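Predicted-values plots such as Figures S63 and S64 can be produced with, for example, the ggeffects package (an assumption: `m` stands for the fitted glmmTMB model above, and the actual figures may have been generated with a different tool):

```r
library(ggeffects)

# Predicted pcr across age, with one curve per gender
pred <- ggpredict(m, terms = c("age", "gender"))
plot(pred)
```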

3.3.1.3 Mediation: gender → education → tone

The mediation model where gender influences tone through years of education finds a highly significant positive indirect effect (ACME) and no direct effect (ADE); as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            0.03816      0.02488         0.05  <2e-16 ***
ADE            -0.00215     -0.02616         0.02  0.8598    
Total Effect    0.03600      0.00838         0.06  0.0082 ** 
Prop. Mediated  1.05577      0.61086         3.48  0.0082 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem() (using linear regression):


Structural Equation Model of tone_pcr_results$med_gender__education$piecewise$linearreg$model 

Call:
  education_years ~ gender_n
  pcr ~ gender_n + education_years

    AIC
 2490.046

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years        gender_n   0.3947    0.0388 461    10.1653  0.0000
              pcr        gender_n  -0.0020    0.0133 460    -0.1537  0.8779
              pcr education_years   0.0154    0.0015 460    10.4643  0.0000
  Std.Estimate    
             - ***
       -0.0067    
        0.4528 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke       0.2
              pcr       none       0.2

Figure S65. Mediation model of gender, years of education and pcr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()); dotted arrows are not significant. Figure generated using R version 4.3.3 (2024-02-29)

Moreover, testing this partial mediation model against the full mediation model using d-separation and model comparison (N.B., using linear regression) finds that there is no direct effect of gender, but that it is entirely mediated through years of education (d-sep p=0.878, model comparison χ2(1)=0.024, p=0.877).

3.3.1.4 Mediation: age → education → tone

Likewise, fitting a mediation model where age influences tone through years of education finds a significant positive direct (ADE) and a significant negative mediated effect (ACME); as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

                Estimate 95% CI Lower 95% CI Upper p-value   
ACME           -0.013198    -0.025383          0.0  0.0226 * 
ADE             0.001814     0.000634          0.0  0.0016 **
Total Effect   -0.011384    -0.023323          0.0  0.0494 * 
Prop. Mediated  1.151918     0.992957          2.1  0.0268 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem() (using linear regression):


Structural Equation Model of tone_pcr_results$med_age__education$piecewise$linearreg$model 

Call:
  education_years ~ age
  pcr ~ age + education_years

    AIC
 2040.451

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0335    0.0014 461   -23.8174  0.0000
              pcr             age   0.0018    0.0006 460     2.8325  0.0048
              pcr education_years   0.0188    0.0019 460    10.1289  0.0000
  Std.Estimate    
             - ***
        0.1543  **
        0.5516 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.71
              pcr       none      0.22

Figure S66. Mediation model of age, years of education and pcr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

Moreover, the effect of age is split between the direct positive effect and the effect mediated through its negative influence on years of education (d-sep p=0.005, model comparison χ2(1)=8.0, p=0.0047).

3.3.1.5 Mediation: location → education → tone

Likewise, fitting a mediation model where location influences tone through years of education finds significant negative (i.e., that B has worse performance than A) direct (ADE) and mediated (ACME) effects; as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            -0.0218      -0.0352        -0.01   2e-04 ***
ADE             -0.0470      -0.0708        -0.02   4e-04 ***
Total Effect    -0.0688      -0.0950        -0.04  <2e-16 ***
Prop. Mediated   0.3165       0.1577         0.52   2e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem() (using linear regression):


Structural Equation Model of tone_pcr_results$med_location__education$piecewise$linearreg$model 

Call:
  education_years ~ location_bin
  pcr ~ location_bin + education_years

    AIC
 2529.685

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years    location_bin  -0.2587    0.0385 461    -6.7199   0e+00
              pcr    location_bin  -0.0472    0.0120 460    -3.9431   1e-04
              pcr education_years   0.0144    0.0014 460    10.1260   0e+00
  Std.Estimate    
             - ***
       -0.1641 ***
        0.4214 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.10
              pcr       none      0.23

Figure S67. Mediation model of location, years of education and pcr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

Moreover, the effect of location is split between the direct and the mediated effects (d-sep p=9.3×10⁻⁵, model comparison χ2(1)=15.4, p=1×10⁻⁴).

3.3.1.6 Mediation: education, age, gender → working memory → tone

Here we want to check whether working memory has any effect on tone above and beyond that of education, age and gender; we find both significant positive direct (ADE) and mediated (ACME) effects, as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            0.00360      0.00178         0.01  <2e-16 ***
ADE             0.01542      0.01152         0.02  <2e-16 ***
Total Effect    0.01902      0.01545         0.02  <2e-16 ***
Prop. Mediated  0.18836      0.09467         0.30  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem() (using linear regression):


Structural Equation Model of tone_pcr_results$med_wm__all$piecewise$linearreg$model 

Call:
  wm_norm ~ education_years + gender_n + age
  pcr ~ wm_norm + education_years + gender_n + age

    AIC
 -1094.782

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

  Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
   wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
   wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
   wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
       pcr         wm_norm   0.1813    0.0415 458     4.3701  0.0000
       pcr education_years   0.0154    0.0021 458     7.4803  0.0000
       pcr        gender_n  -0.0007    0.0131 458    -0.0525  0.9582
       pcr             age   0.0029    0.0007 458     4.3316  0.0000
  Std.Estimate    
        0.4191 ***
       -0.0649    
       -0.3744 ***
        0.2515 ***
        0.4528 ***
       -0.0022    
        0.2507 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

  Response method R.squared
   wm_norm   none      0.50
       pcr   none      0.25

Figure S68. Mediation model of education, age, gender, wm and pcr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

It can be seen that, indeed, working memory has a positive effect on tone while also mediating some of the effects of age and education.

3.3.1.7 Complex path model

Finally, we fitted a complex path model involving all these factors using piecewiseSEM::psem() (with linear regression):


Structural Equation Model of tone_pcr_results$path_model$linearreg$model 

Call:
  education_years ~ age + gender_n + location_bin
  age ~ location_bin + gender_n
  gender_n ~ location_bin
  wm_norm ~ education_years + gender_n + age
  pcr ~ wm_norm + age + gender_n + education_years + location_bin

    AIC
 5655.388

---
Tests of directed separation:

                Independ.Claim Test.Type  DF Crit.Value P.Value 
  wm_norm ~ location_bin + ...      coef 458     1.7014  0.0895 

--
Global goodness-of-fit:

Chi-Squared = 2.917 with P-value = 0.088 and on 1 degrees of freedom
Fisher's C = 4.826 with P-value = 0.09 and on 2 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0318    0.0014 459   -22.4979  0.0000
  education_years        gender_n   0.2755    0.0393 459     7.0071  0.0000
  education_years    location_bin  -0.2420    0.0385 459    -6.2790  0.0000
              age    location_bin  -0.1792    1.1363 460    -0.1577  0.8747
              age        gender_n  -2.8378    1.2138 460    -2.3379  0.0198
         gender_n    location_bin  -0.0181    0.1986 461    -0.0910  0.9275
          wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
          wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
          wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
              pcr         wm_norm   0.1942    0.0410 457     4.7385  0.0000
              pcr             age   0.0026    0.0007 457     3.8377  0.0001
              pcr        gender_n   0.0030    0.0129 457     0.2311  0.8173
              pcr education_years   0.0133    0.0021 457     6.2998  0.0000
              pcr    location_bin  -0.0470    0.0119 457    -3.9514  0.0001
  Std.Estimate    
             - ***
             - ***
             - ***
       -0.0073    
       -0.1084   *
        -0.005    
        0.4191 ***
       -0.0649    
       -0.3744 ***
        0.2693 ***
        0.2206 ***
        0.0097    
         0.389 ***
       -0.1633 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.76
              age       none      0.01
         gender_n nagelkerke      0.00
          wm_norm       none      0.50
              pcr       none      0.27

Figure S69. The complex path model for pcr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

This confirms and extends the previous simpler mediation models in that pcr is directly influenced by working memory (positively), years of education (more is better), age (older is better) and location (A is better), but also indirectly by gender (mediated through years of education and age); both age and location also have effects mediated through years of education, and age and years of education have effects mediated through working memory.

3.3.1.8 Conclusions

The tone task performance, as measured by the % correct total responses in the recoded dataset (aka pcr), does cluster somewhat within families (which we ignore here due to the low sample size and small amount of variance explained), and is better for participants from A, for older participants, and for those with more years of education and better working memory performance, while gender has only an indirect effect.

3.3.2 d’ (recode)

We focus here on the d’ estimate (dpr = d prime recoded). Its distribution seems relatively normal, and diagnostics (not shown) suggest that a linear model does indeed fit this variable quite well. However, piecewiseSEM has trouble estimating (some of) the standardized path coefficients, so we report the unstandardized ones here.
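For reference, d’ in the standard signal-detection sense is z(hit rate) − z(false-alarm rate). A minimal sketch with a log-linear correction to avoid infinite values at rates of exactly 0 or 1 (the correction used for the actual dpr estimate is not shown here and may differ):

```r
# d' from per-participant counts of hits, misses, false alarms (fas)
# and correct rejections (crs)
dprime <- function(hits, misses, fas, crs) {
  hr  <- (hits + 0.5) / (hits + misses + 1)  # corrected hit rate
  far <- (fas + 0.5) / (fas + crs + 1)       # corrected false-alarm rate
  qnorm(hr) - qnorm(far)
}

dprime(hits = 18, misses = 2, fas = 4, crs = 16)
```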

3.3.2.1 Location and family

There is a highly significant difference between the two main locations (A has higher overall performance than B; βB-A=-0.63, p=1.56×10⁻¹⁰). For those 91 participants with information about family relationships, the generation they belong to has no effect (βold-young=0.12, p=0.62), and, moreover, there is no clustering within families (the linear model with family as a random effect has an ICC of 4.4%), suggesting that we need not model these factors here.

3.3.2.2 Multiple regression

A multiple linear regression of dpr on working memory, age, gender, years of education and location and all their interactions simplifies (using manual simplification based on the F-test’s p-value) to a model with main effects and two 2-way interactions only:


Call:
lm(formula = dpr ~ age + gender + education_years + location_ab + 
    wm_norm + age:education_years + gender:education_years, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.82011 -0.46418  0.07605  0.60458  2.14666 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)              1.910244   0.506963   3.768 0.000186 ***
age                     -0.013304   0.010025  -1.327 0.185169    
genderM                  0.621352   0.194244   3.199 0.001476 ** 
education_years         -0.055608   0.047563  -1.169 0.242956    
location_abB            -0.481857   0.084157  -5.726 1.87e-08 ***
wm_norm                  1.639536   0.290392   5.646 2.90e-08 ***
age:education_years      0.003958   0.001014   3.901 0.000110 ***
genderM:education_years -0.090083   0.024266  -3.712 0.000231 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8705 on 455 degrees of freedom
Multiple R-squared:  0.3593,    Adjusted R-squared:  0.3495 
F-statistic: 36.46 on 7 and 455 DF,  p-value: < 2.2e-16

Figure S70. Untransformed slopes with standard errors. Figure generated using R version 4.3.3 (2024-02-29)


Figure S71. Predicted values of dpr showing the interaction of education_years and gender. Figure generated using R version 4.3.3 (2024-02-29)


Figure S72. Predicted values of dpr showing the interaction of education_years and age. Figure generated using R version 4.3.3 (2024-02-29)

suggesting that age and years of education have no main effects, that working memory has a main positive effect, that males have better performance than females, with interactions between age and education and between gender and education, and that participants from A have better performance than those from B. As above, we test more complex models of causality using mediation and path analysis.

3.3.2.3 Mediation: gender → education → tone

The mediation model where gender influences tone through years of education finds a highly significant positive indirect effect (ACME) and no direct effect (ADE); as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             0.2904       0.1874         0.40  <2e-16 ***
ADE             -0.0112      -0.1980         0.17    0.91    
Total Effect     0.2792       0.0679         0.49    0.01 *  
Prop. Mediated   1.0329       0.5950         3.32    0.01 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem():


Structural Equation Model of tone_dpr_results$med_gender__education$piecewise$model 

Call:
  education_years ~ gender_n
  dpr ~ gender_n + education_years

    AIC
 4349.853

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years        gender_n   0.3947    0.0388 461    10.1653  0.0000
              dpr        gender_n  -0.0111    0.0992 460    -0.1121  0.9108
              dpr education_years   0.1179    0.0110 460    10.7169  0.0000
  Std.Estimate    
             - ***
       -0.0048    
        0.4613 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.20
              dpr       none      0.21

Figure S73. Mediation model of gender, years of education and dpr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()); dotted arrows are not significant. Figure generated using R version 4.3.3 (2024-02-29)

Moreover, testing this partial mediation model against the full mediation model using d-separation and model comparison finds that there is no direct effect of gender, but that it is entirely mediated through years of education (d-sep p=0.911, model comparison χ2(1)=0.013, p=0.909).

3.3.2.4 Mediation: age → education → tone

Likewise, fitting a mediation model where age influences tone through years of education finds a significant positive direct (ADE) and a significant negative mediated effect (ACME); as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME           -0.10432     -0.19937        -0.01  0.0242 *  
ADE             0.01632      0.00703         0.03  0.0002 ***
Total Effect   -0.08800     -0.18221         0.00  0.0572 .  
Prop. Mediated  1.17639      0.59488         2.30  0.0330 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem():


Structural Equation Model of tone_dpr_results$med_age__education$piecewise$model 

Call:
  education_years ~ age
  dpr ~ age + education_years

    AIC
 3896.584

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0335    0.0014 461   -23.8174   0e+00
              dpr             age   0.0163    0.0047 460     3.4264   7e-04
              dpr education_years   0.1484    0.0138 460    10.7601   0e+00
  Std.Estimate    
             - ***
        0.1849 ***
        0.5807 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.71
              dpr       none      0.23

Figure S74. Mediation model of age, years of education and dpr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

Moreover, the effect of age is split between the direct positive effect and the effect mediated through its negative influence on years of education (d-sep p=6.66×10⁻⁴, model comparison χ2(1)=11.7, p=6×10⁻⁴).

3.3.2.5 Mediation: location → education → tone

Likewise, fitting a mediation model where location influences tone through years of education finds significant negative (i.e., that B has worse performance than A) direct (ADE) and mediated (ACME) effects; as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             -0.164       -0.261        -0.07   2e-04 ***
ADE              -0.466       -0.633        -0.30  <2e-16 ***
Total Effect     -0.629       -0.819        -0.44  <2e-16 ***
Prop. Mediated    0.259        0.131         0.40   2e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem():


Structural Equation Model of tone_dpr_results$med_location__education$piecewise$model 

Call:
  education_years ~ location_bin
  dpr ~ location_bin + education_years

    AIC
 4377.625

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years    location_bin  -0.2587    0.0385 461    -6.7199       0
              dpr    location_bin  -0.4654    0.0881 460    -5.2803       0
              dpr education_years   0.1076    0.0104 460    10.3027       0
  Std.Estimate    
             - ***
       -0.2158 ***
        0.4211 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.10
              dpr       none      0.26

Figure S75. Mediation model of location, years of education and dpr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

Moreover, the effect of location is split between the direct and the mediated effects (d-sep p=1.99×10-7, model comparison χ2(1)=27.2, p<0.001).
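A minimal sketch of the corresponding `piecewiseSEM::psem()` fit and of the nested-model comparison behind the χ2 test above (the data frame `d` and variable names are assumed from the model Call shown above; the cached analysis may differ):

```r
library(piecewiseSEM)

# Full model: direct path location -> dpr plus the mediated path
psem_full <- psem(
  lm(education_years ~ location_bin, data = d),
  lm(dpr ~ location_bin + education_years, data = d)
)
summary(psem_full)  # coefficients, d-sep tests, Fisher's C, R-squared

# Fully mediated model (no direct path); dropping the direct path
# creates the independence claim tested by d-separation, and the
# chi-square difference between the nested models gives the
# "model comparison" reported in the text
psem_mediated <- psem(
  lm(education_years ~ location_bin, data = d),
  lm(dpr ~ education_years, data = d)
)
anova(psem_mediated, psem_full)
```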

3.3.2.6 Mediation: education, age, gender → working memory → tone

Here we check whether working memory has any effect on tone above and beyond that of education, age and gender; we find both significant positive direct (ADE) and mediated (ACME) effects, as fitted using mediation::mediate() (with linear regression):


Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             0.0307       0.0170         0.05  <2e-16 ***
ADE              0.1195       0.0901         0.15  <2e-16 ***
Total Effect     0.1502       0.1229         0.18  <2e-16 ***
Prop. Mediated   0.2034       0.1120         0.31  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

and as fitted using piecewiseSEM::psem():


Structural Equation Model of psem_dpr_wm 

Call:
  wm_norm ~ education_years + gender_n + age
  dpr ~ wm_norm + education_years + gender_n + age

    AIC
 755.093

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

  Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
   wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
   wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
   wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
       dpr         wm_norm   1.5471    0.3059 458     5.0572  0.0000
       dpr education_years   0.1194    0.0152 458     7.8449  0.0000
       dpr        gender_n  -0.0011    0.0963 458    -0.0113  0.9910
       dpr             age   0.0259    0.0050 458     5.1699  0.0000
  Std.Estimate    
        0.4191 ***
       -0.0649    
       -0.3744 ***
        0.2865 ***
        0.4674 ***
       -0.0005    
        0.2945 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

  Response method R.squared
   wm_norm   none      0.50
       dpr   none      0.27

Figure S76. Mediation model of education, age, gender, wm and dpr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

It can be seen that, indeed, working memory has a positive effect on tone while also mediating some of the effects of age and education.

3.3.2.7 Complex path model

Finally, we fitted a complex path model involving all these factors using piecewiseSEM::psem():


Structural Equation Model of psem_dpr_path 

Call:
  education_years ~ age + gender_n + location_bin
  age ~ location_bin + gender_n
  gender_n ~ location_bin
  wm_norm ~ education_years + gender_n + age
  dpr ~ wm_norm + age + gender_n + education_years + location_bin

    AIC
 7492.614

---
Tests of directed separation:

                Independ.Claim Test.Type  DF Crit.Value P.Value 
  wm_norm ~ location_bin + ...      coef 458     1.7014  0.0895 

--
Global goodness-of-fit:

Chi-Squared = 2.917 with P-value = 0.088 and on 1 degrees of freedom
Fisher's C = 4.826 with P-value = 0.09 and on 2 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0318    0.0014 459   -22.4979  0.0000
  education_years        gender_n   0.2755    0.0393 459     7.0071  0.0000
  education_years    location_bin  -0.2420    0.0385 459    -6.2790  0.0000
              age    location_bin  -0.1792    1.1363 460    -0.1577  0.8747
              age        gender_n  -2.8378    1.2138 460    -2.3379  0.0198
         gender_n    location_bin  -0.0181    0.1986 461    -0.0910  0.9275
          wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
          wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
          wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
              dpr         wm_norm   1.6736    0.2980 457     5.6161  0.0000
              dpr             age   0.0224    0.0049 457     4.5670  0.0000
              dpr        gender_n   0.0351    0.0938 457     0.3739  0.7087
              dpr education_years   0.0980    0.0153 457     6.3950  0.0000
              dpr    location_bin  -0.4637    0.0865 457    -5.3576  0.0000
  Std.Estimate    
             - ***
             - ***
             - ***
       -0.0073    
       -0.1084   *
        -0.005    
        0.4191 ***
       -0.0649    
       -0.3744 ***
        0.3099 ***
        0.2549 ***
        0.0152    
        0.3833 ***
        -0.215 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.76
              age       none      0.01
         gender_n nagelkerke      0.00
          wm_norm       none      0.50
              dpr       none      0.32

Figure S77. The complex path model for dpr showing the unstandardized coefficients (fitted using piecewiseSEM::psem()). Figure generated using R version 4.3.3 (2024-02-29)

This confirms and extends the previous simpler mediation models in that dpr is directly influenced by working memory (positively), years of education (more is better), age (older is better) and location (A is better), but also indirectly by gender (mediated through years of education and age); both age and location also have effects mediated through years of education, and age and years of education have effects mediated through working memory.
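The complex path model above can be sketched as follows (a hedged reconstruction from the Call shown above; note that the Nagelkerke R-squared reported for education_years and gender_n suggests those sub-models were actually GLMs, so the plain `lm()` calls below are only illustrative):

```r
library(piecewiseSEM)

# Illustrative reconstruction of the path model from its Call;
# lm() is used throughout, although the reported Nagelkerke
# R-squared hints that some sub-models were fitted as GLMs
psem_path <- psem(
  lm(education_years ~ age + gender_n + location_bin, data = d),
  lm(age ~ location_bin + gender_n, data = d),
  lm(gender_n ~ location_bin, data = d),
  lm(wm_norm ~ education_years + gender_n + age, data = d),
  lm(dpr ~ wm_norm + age + gender_n + education_years + location_bin,
     data = d)
)
summary(psem_path)  # global fit (Fisher's C) plus path coefficients
```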

3.3.2.8 Conclusions

As expected, d’ and % correct total responses in the recoded dataset behave very similarly, down to very similar coefficient estimates and associated p-values. We compared the two measures using AIC (a rather “apples to oranges” comparison in this case): the multiple regression models give ΔAIC(%cr, d’) = -2136.0 and the complex path models ΔAIC(%cr, d’) = -1837.2, both suggesting that %cr fits the data much better than d’.

3.3.3 Secondary measures

Here we look at secondary measures.

3.3.3.1 % correct total (original)

We focus here on the % total correct responses estimated on the ‘original’ dataset (pco = percent correct original) – see the comments above for pcr.

3.3.3.1.1 Location and family

There is a highly significant difference between the two main locations (A has higher overall performance than B; beta regression βB-A=-0.3, p=2.73×10-7). For those 91 participants with information about family relationships, the generation they belong to has no effect (beta regression βold-young=0.07, p=0.619) and there is no clustering within families (the linear model with family as a random effect has an ICC of 9.8%, and including family as a fixed effect in a “flat” beta regression vs excluding it results in a p=0.295).
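The family-clustering check can be sketched as follows (the column name `family` is hypothetical; `performance::icc()` extracts the intraclass correlation from a random-intercept model):

```r
library(lme4)
library(performance)

# Random intercept per family; the ICC quantifies how much of the
# variance in pco is attributable to between-family differences
m_fam <- lmer(pco ~ 1 + (1 | family), data = d)
icc(m_fam)  # adjusted and unadjusted ICC
```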

3.3.3.1.2 Multiple regression

 Family: beta  ( logit )
Formula:          
pco ~ age + gender + education_years + location_ab + wm_norm +  
    gender:education_years + education_years:wm_norm
Data: d

     AIC      BIC   logLik deviance df.resid 
  -909.2   -872.0    463.6   -927.2      454 


Dispersion parameter for beta family (): 17.8 

Conditional model:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)             -0.037382   0.199472  -0.187 0.851344    
age                      0.010988   0.003276   3.354 0.000796 ***
genderM                  0.336120   0.122824   2.737 0.006208 ** 
education_years          0.116932   0.016936   6.904 5.04e-12 ***
location_abB            -0.243294   0.053010  -4.590 4.44e-06 ***
wm_norm                  1.608457   0.298826   5.383 7.34e-08 ***
genderM:education_years -0.052539   0.015700  -3.346 0.000819 ***
education_years:wm_norm -0.107255   0.034818  -3.080 0.002067 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure S78. Untransformed slopes with standard errors. Figure generated using R version 4.3.3 (2024-02-29)


Figure S79. Predicted values of pco showing the interaction of gender and age. Figure generated using R version 4.3.3 (2024-02-29)


Figure S80. Predicted values of pco showing the interaction of education_years and age. Figure generated using R version 4.3.3 (2024-02-29)

3.3.3.1.3 Mediation: gender → education → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            0.03391      0.02207         0.05  <2e-16 ***
ADE            -0.00220     -0.02378         0.02    0.84    
Total Effect    0.03171      0.00673         0.06    0.01 ** 
Prop. Mediated  1.06049      0.61053         3.69    0.01 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Structural Equation Model of tone_pco_results$med_gender__education$piecewise$linearreg$model 

Call:
  education_years ~ gender_n
  pco ~ gender_n + education_years

    AIC
 2376.695

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years        gender_n   0.3947    0.0388 461    10.1653  0.0000
              pco        gender_n  -0.0022    0.0118 460    -0.1836  0.8544
              pco education_years   0.0137    0.0013 460    10.5265  0.0000
  Std.Estimate    
             - ***
       -0.0079    
         0.455 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.20
              pco       none      0.21

Figure S81. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

d-sep p=0.854, model comparison χ2(1)=0.034, p=0.854

3.3.3.1.4 Mediation: age → education → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

                Estimate 95% CI Lower 95% CI Upper p-value   
ACME           -0.011765    -0.022675         0.00  0.0202 * 
ADE             0.001689     0.000639         0.00  0.0028 **
Total Effect   -0.010077    -0.020841         0.00  0.0486 * 
Prop. Mediated  1.160614     0.986975         2.16  0.0286 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Structural Equation Model of tone_pco_results$med_age__education$piecewise$linearreg$model 

Call:
  education_years ~ age
  pco ~ age + education_years

    AIC
 1926.169

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0335    0.0014 461   -23.8174  0.0000
              pco             age   0.0017    0.0006 460     2.9957  0.0029
              pco education_years   0.0169    0.0016 460    10.2867  0.0000
  Std.Estimate    
             - ***
        0.1628  **
        0.5591 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.71
              pco       none      0.22

Figure S82. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

d-sep p=0.003, model comparison χ2(1)=8.9, p=0.0028.

3.3.3.1.5 Mediation: location → education → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            -0.0194      -0.0309        -0.01   8e-04 ***
ADE             -0.0403      -0.0607        -0.02   2e-04 ***
Total Effect    -0.0597      -0.0827        -0.04  <2e-16 ***
Prop. Mediated   0.3235       0.1610         0.53   8e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Structural Equation Model of tone_pco_results$med_location__education$piecewise$linearreg$model 

Call:
  education_years ~ location_bin
  pco ~ location_bin + education_years

    AIC
 2417.395

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years    location_bin  -0.2587    0.0385 461    -6.7199   0e+00
              pco    location_bin  -0.0404    0.0106 460    -3.8039   2e-04
              pco education_years   0.0128    0.0013 460    10.1948   0e+00
  Std.Estimate    
             - ***
       -0.1583 ***
        0.4243 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.10
              pco       none      0.23

Figure S83. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

d-sep p=1.62×10-4, model comparison χ2(1)=14.3, p=2e-04

3.3.3.1.6 Mediation: education, age, gender → working memory → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME            0.00324      0.00165         0.01  <2e-16 ***
ADE             0.01385      0.01047         0.02  <2e-16 ***
Total Effect    0.01709      0.01395         0.02  <2e-16 ***
Prop. Mediated  0.18824      0.09633         0.30  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Structural Equation Model of tone_pco_results$med_wm__all$piecewise$linearreg$model 

Call:
  wm_norm ~ education_years + gender_n + age
  pco ~ wm_norm + education_years + gender_n + age

    AIC
 -1209.875

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

  Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
   wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
   wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
   wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
       pco         wm_norm   0.1634    0.0366 458     4.4599  0.0000
       pco education_years   0.0139    0.0018 458     7.6064  0.0000
       pco        gender_n  -0.0011    0.0115 458    -0.0920  0.9268
       pco             age   0.0027    0.0006 458     4.5253  0.0000
  Std.Estimate    
        0.4191 ***
       -0.0649    
       -0.3744 ***
        0.2559 ***
        0.4591 ***
       -0.0039    
        0.2611 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

  Response method R.squared
   wm_norm   none      0.50
       pco   none      0.25

Figure S84. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

3.3.3.1.7 Complex path model

Structural Equation Model of tone_pco_results$path_model$linearreg$model 

Call:
  education_years ~ age + gender_n + location_bin
  age ~ location_bin + gender_n
  gender_n ~ location_bin
  wm_norm ~ education_years + gender_n + age
  pco ~ wm_norm + age + gender_n + education_years + location_bin

    AIC
 5541.539

---
Tests of directed separation:

                Independ.Claim Test.Type  DF Crit.Value P.Value 
  wm_norm ~ location_bin + ...      coef 458     1.7014  0.0895 

--
Global goodness-of-fit:

Chi-Squared = 2.917 with P-value = 0.088 and on 1 degrees of freedom
Fisher's C = 4.826 with P-value = 0.09 and on 2 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0318    0.0014 459   -22.4979  0.0000
  education_years        gender_n   0.2755    0.0393 459     7.0071  0.0000
  education_years    location_bin  -0.2420    0.0385 459    -6.2790  0.0000
              age    location_bin  -0.1792    1.1363 460    -0.1577  0.8747
              age        gender_n  -2.8378    1.2138 460    -2.3379  0.0198
         gender_n    location_bin  -0.0181    0.1986 461    -0.0910  0.9275
          wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
          wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
          wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
              pco         wm_norm   0.1743    0.0362 457     4.8103  0.0000
              pco             age   0.0024    0.0006 457     4.0484  0.0001
              pco        gender_n   0.0020    0.0114 457     0.1795  0.8577
              pco education_years   0.0120    0.0019 457     6.4562  0.0000
              pco    location_bin  -0.0399    0.0105 457    -3.7875  0.0002
  Std.Estimate    
             - ***
             - ***
             - ***
       -0.0073    
       -0.1084   *
        -0.005    
        0.4191 ***
       -0.0649    
       -0.3744 ***
        0.2729 ***
        0.2323 ***
        0.0075    
         0.398 ***
       -0.1563 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.76
              age       none      0.01
         gender_n nagelkerke      0.00
          wm_norm       none      0.50
              pco       none      0.28

Figure S85. The complex path model. Figure generated using R version 4.3.3 (2024-02-29)

3.3.3.1.8 Conclusions

pco behaves relatively similarly to pcr, and we compared the two using AIC (a more meaningful comparison here): the multiple regression models give ΔAIC(pcr, pco) = -31.3, favouring pcr (the ‘reduced’ dataset), while the complex path models give ΔAIC(pcr, pco) = 113.8, favouring pco (the ‘original’ dataset).

3.3.3.2 d’ (original)

We focus here on d’ estimated on the ‘original’ dataset (dpo = d prime original) – see the comments above for dpr.
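For reference, d’ is the standardized distance between the hit and false-alarm distributions; a minimal sketch follows (the exact correction for extreme rates used in the analysis is not shown in this document, so the log-linear correction below is an assumption):

```r
# d' = z(hit rate) - z(false-alarm rate); the log-linear correction
# keeps qnorm() finite when a raw rate would be exactly 0 or 1
dprime <- function(hits, false_alarms, n_signal, n_noise) {
  h  <- (hits + 0.5) / (n_signal + 1)
  fa <- (false_alarms + 0.5) / (n_noise + 1)
  qnorm(h) - qnorm(fa)
}
```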

3.3.3.2.1 Location and family

There is a highly significant difference between the two main locations (A has higher overall performance than B; βB-A=-0.63, p=1.56×10-10). For those 91 participants with information about family relationships, the generation they belong to has no effect (βold-young=0.12, p=0.62) and there is no clustering within families (the linear model with family as a random effect has an ICC of 4.4%), suggesting that we need not model these factors here.

3.3.3.2.2 Multiple regression

Call:
lm(formula = dpo ~ age + gender + education_years + location_ab + 
    wm_norm + age:education_years + gender:education_years, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.53839 -0.35857  0.08772  0.51313  1.87452 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)              1.4281037  0.4467995   3.196 0.001489 ** 
age                     -0.0075783  0.0088356  -0.858 0.391504    
genderM                  0.5576186  0.1711920   3.257 0.001209 ** 
education_years         -0.0297572  0.0419181  -0.710 0.478137    
location_abB            -0.3529232  0.0741694  -4.758 2.63e-06 ***
wm_norm                  1.4759196  0.2559303   5.767 1.49e-08 ***
age:education_years      0.0030647  0.0008941   3.428 0.000664 ***
genderM:education_years -0.0730950  0.0213866  -3.418 0.000688 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7672 on 455 degrees of freedom
Multiple R-squared:  0.3526,    Adjusted R-squared:  0.3427 
F-statistic: 35.41 on 7 and 455 DF,  p-value: < 2.2e-16

Figure S86. Untransformed slopes with standard errors. Figure generated using R version 4.3.3 (2024-02-29)


Figure S87. Predicted values of dpo showing the interaction of gender and age. Figure generated using R version 4.3.3 (2024-02-29)


Figure S88. Predicted values of dpo showing the interaction of education_years and age. Figure generated using R version 4.3.3 (2024-02-29)

3.3.3.2.3 Mediation gender → education → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             0.2583       0.1664         0.36  <2e-16 ***
ADE              0.0425      -0.1220         0.21  0.6134    
Total Effect     0.3008       0.1122         0.49  0.0014 ** 
Prop. Mediated   0.8609       0.5256         2.04  0.0014 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Structural Equation Model of tone_dpo_results$med_gender__education$piecewise$model 

Call:
  education_years ~ gender_n
  dpo ~ gender_n + education_years

    AIC
 4220.805

---
Tests of directed separation:

 No independence claims present. Tests of directed separation not possible.

--
Global goodness-of-fit:

Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years        gender_n   0.3947    0.0388 461    10.1653   0.000
              dpo        gender_n   0.0427    0.0863 460     0.4948   0.621
              dpo education_years   0.1046    0.0096 460    10.9325   0.000
  Std.Estimate    
             - ***
        0.0211    
        0.4669 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.20
              dpo       none      0.22

Figure S89. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

d-sep p=0.621, model comparison χ2(1)=0.25, p=0.62

3.3.3.2.4 Mediation age → education → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME           -0.09232     -0.17748        -0.01  0.0250 *  
ADE             0.01412      0.00629         0.02  0.0004 ***
Total Effect   -0.07820     -0.16227         0.00  0.0544 .  
Prop. Mediated  1.17229      0.65176         2.14  0.0294 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Figure S90. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

d-sep p=6.86×10-4, model comparison χ2(1)=11.6, p=7e-04.

3.3.3.2.5 Mediation location → education → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             -0.150       -0.239        -0.07   4e-04 ***
ADE              -0.336       -0.486        -0.19  <2e-16 ***
Total Effect     -0.487       -0.659        -0.32  <2e-16 ***
Prop. Mediated    0.307        0.159         0.48   4e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Figure S91. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

d-sep p=1.62×10-5, model comparison χ2(1)=18.7, p<0.001

3.3.3.2.6 Mediation education, age, gender → working memory → tone

Causal Mediation Analysis 

Quasi-Bayesian Confidence Intervals

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             0.0280       0.0164         0.04  <2e-16 ***
ADE              0.1040       0.0785         0.13  <2e-16 ***
Total Effect     0.1320       0.1082         0.16  <2e-16 ***
Prop. Mediated   0.2110       0.1214         0.32  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sample Size Used: 463 


Simulations: 10000 

Figure S92. Mediation model. Figure generated using R version 4.3.3 (2024-02-29)

3.3.3.2.7 Complex path model

Structural Equation Model of psem_dpo_path 

Call:
  education_years ~ age + gender_n + location_bin
  age ~ location_bin + gender_n
  gender_n ~ location_bin
  wm_norm ~ education_years + gender_n + age
  dpo ~ wm_norm + age + gender_n + education_years + location_bin

    AIC
 7369.870

---
Tests of directed separation:

                Independ.Claim Test.Type  DF Crit.Value P.Value 
  wm_norm ~ location_bin + ...      coef 458     1.7014  0.0895 

--
Global goodness-of-fit:

Chi-Squared = 2.917 with P-value = 0.088 and on 1 degrees of freedom
Fisher's C = 4.826 with P-value = 0.09 and on 2 degrees of freedom

---
Coefficients:

         Response       Predictor Estimate Std.Error  DF Crit.Value P.Value
  education_years             age  -0.0318    0.0014 459   -22.4979  0.0000
  education_years        gender_n   0.2755    0.0393 459     7.0071  0.0000
  education_years    location_bin  -0.2420    0.0385 459    -6.2790  0.0000
              age    location_bin  -0.1792    1.1363 460    -0.1577  0.8747
              age        gender_n  -2.8378    1.2138 460    -2.3379  0.0198
         gender_n    location_bin  -0.0181    0.1986 461    -0.0910  0.9275
          wm_norm education_years   0.0198    0.0021 459     9.3059  0.0000
          wm_norm        gender_n  -0.0277    0.0146 459    -1.8919  0.0591
          wm_norm             age  -0.0061    0.0007 459    -8.5939  0.0000
              dpo         wm_norm   1.5057    0.2610 457     5.7688  0.0000
              dpo             age   0.0201    0.0043 457     4.6820  0.0000
              dpo        gender_n   0.0802    0.0821 457     0.9763  0.3294
              dpo education_years   0.0884    0.0134 457     6.5909  0.0000
              dpo    location_bin  -0.3379    0.0758 457    -4.4585  0.0000
  Std.Estimate    
             - ***
             - ***
             - ***
       -0.0073    
       -0.1084   *
        -0.005    
        0.4191 ***
       -0.0649    
       -0.3744 ***
         0.318 ***
         0.261 ***
        0.0397    
        0.3947 ***
       -0.1787 ***

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05

---
Individual R-squared:

         Response     method R.squared
  education_years nagelkerke      0.76
              age       none      0.01
         gender_n nagelkerke      0.00
          wm_norm       none      0.50
              dpo       none      0.32

Figure S93. The complex path model. Figure generated using R version 4.3.3 (2024-02-29)

3.3.3.2.8 Conclusions

dpo behaves relatively similarly to dpr, and we compared the two using AIC (a more meaningful comparison here): the multiple regression models give ΔAIC(dpr, dpo) = 117.0 and the complex path models ΔAIC(dpr, dpo) = 122.7, both suggesting that dpr (the ‘reduced’ dataset) fits the data worse than dpo (the ‘original’ dataset). We also compared dpo and pco: the multiple regression models give ΔAIC(pco, dpo) = -1987.7 and the complex path models ΔAIC(pco, dpo) = -1828.3, both suggesting that pco fits the data much better than dpo.

Finally, let’s compare both measures on both datasets:

Table S26. Model comparison (using AIC) between both measures on both datasets.
  Measure 1  Measure 2  ΔAIC (multiple regression)  ΔAIC (complex path model)
  pcr        dpr                          -2136.0                    -1837.2
  pcr        pco                            -31.3                      113.8
  pcr        dpo                          -2019.0                    -1714.5
  dpr        pco                           2104.7                     1951.1
  dpr        dpo                            117.0                      122.7
  pco        dpo                          -1987.7                    -1828.3

It can be seen that the ordering (taking into account various caveats) is: pcr > pco >> dpo > dpr, suggesting that, indeed, the % correct responses on the ‘recoded’ dataset might be the best choice.
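The pairwise comparisons in Table S26 can be sketched as follows (the model objects `m_pcr`, `m_pco`, `m_dpr` and `m_dpo` are hypothetical stand-ins for the fitted models stored in the cached results):

```r
# Pairwise ΔAIC(row, column) = AIC(row model) - AIC(column model);
# negative values favour the row's measure
models <- list(pcr = m_pcr, pco = m_pco, dpr = m_dpr, dpo = m_dpo)
aics   <- sapply(models, AIC)
round(outer(aics, aics, `-`), 1)
```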

3.3.3.3 Bias c (recoded)

The bias c on the ‘recoded’ dataset (cbr = c bias recoded) shows some clustering within families (ICC = 18.2%), but no effect of either generation or location. While the clustering within families is potentially interesting (it might suggest that the bias has a familial component, with shared environmental and/or genetic contributions), we cannot really model it here due to the massive loss of data (dropping from 489 to 91 participants).
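For reference, the bias c comes from the same hit and false-alarm rates as d’ (a minimal sketch; corrections for extreme rates are omitted):

```r
# c = -(z(hit rate) + z(false-alarm rate)) / 2; positive values
# indicate a conservative bias (here, per the text, a tendency to
# answer 'same')
bias_c <- function(h, fa) -(qnorm(h) + qnorm(fa)) / 2
```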

Interestingly, including all the potential predictors (and their interactions) in a multiple linear regression results in a model that is no better than the null model with no predictors (F(31)=1.1, p=0.377), suggesting that the bias c is idiosyncratic, probably influenced by personality, genetic and/or cultural variables that we did not measure here.

3.3.3.4 Bias c (original)

The bias c on the ‘original’ dataset (cbo = c bias original) shows some clustering within families (ICC = 12.3%), but no effect of either generation or location. As above, while the clustering within families is potentially interesting, we cannot really model it here due to the massive loss of data.

Including all the potential predictors (and their interactions) in a multiple linear regression model results, after manual simplification, in only years of education having a significant positive effect (i.e., participants with more years of education have a tendency to answer ‘same’, which might simply say something about how they treat the “weird” items):


Call:
lm(formula = cbo ~ education_years, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.98951 -0.25221  0.00983  0.24083  1.24920 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.389935   0.028387  13.736  < 2e-16 ***
education_years 0.020921   0.003914   5.345 1.43e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3553 on 461 degrees of freedom
Multiple R-squared:  0.05835,   Adjusted R-squared:  0.0563 
F-statistic: 28.56 on 1 and 461 DF,  p-value: 1.428e-07

3.3.4 Comparing the “original”, “recoded” and “removed” results

Here we compare side-by-side the complex path models (with linear instead of beta regressions and showing the unstandardized coefficients) for the original (“o”), recoded (“r”) and removed (“x”) datasets.

3.3.4.1 % correct responses

Figure S94. The complex path model for pco. Figure generated using R version 4.3.3 (2024-02-29)

Figure S95. The complex path model for pcr. Figure generated using R version 4.3.3 (2024-02-29)

Figure S96. The complex path model for pcx. Figure generated using R version 4.3.3 (2024-02-29)

3.3.4.2 d’

Figure S97. The complex path model for dpo. Figure generated using R version 4.3.3 (2024-02-29)

Figure S98. The complex path model for dpr. Figure generated using R version 4.3.3 (2024-02-29)

Figure S99. The complex path model for dpx. Figure generated using R version 4.3.3 (2024-02-29)

3.3.5 Conclusions

Putting everything together, it seems that:

  • on the ‘recoded’ dataset (i.e., with the “weird” items recoded as ‘same’ items), both measures of performance, the % total correct responses and the bias-free d’, behave in very similar ways, namely:

    • there is no family clustering and no effect of generation (but there is too little data to draw solid conclusions),
    • multiple regression suggests that participants from A perform better across the board, as do males and participants with a higher working memory, while age and years of education have no main effects but are involved in interactions,
    • mediation analyses dissect these, showing that:
      • gender has no direct effect, its influence being fully mediated by education, resulting in a positive total effect,
      • age has both a positive direct effect, and a negative indirect effect (mediated through education), resulting in a very weak negative (formally ns) total effect,
      • A’s better overall performance than that of B (total effect) is due both to a mediated effect (through education, higher by about 1.5 years) and to a strong direct effect (participants from A do the task better),
      • working memory has a positive effect of its own and also mediates those of age and education,
    • the complex path model confirms these and suggests that the performance on the tone task is simultaneously influenced by multiple factors, directly and indirectly, in particular being increased for the participants with a higher working memory, from A, more educated and/or older.
  • on the ‘original’ dataset, the two measures also behave similarly;

  • the % total correct responses on the ‘removed’ dataset fits the data best (in terms of AIC), so, all in all, we should probably take it as our primary measure.

4 Appendices

4.1 Appendix I

Here we give some technical details, also citing the most important methodological packages we use in this paper.

PCA: we used prcomp(...) in package stats to estimate the PCs (which uses the singular value decomposition method), and fviz_eig(...) and fviz_pca_var(...) in package factoextra (Kassambara & Mundt, 2020).
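
As a minimal, self-contained sketch of this PCA workflow (using the built-in USArrests dataset rather than the study data):

```r
# prcomp() uses singular value decomposition internally; centering and
# scaling put all variables on a comparable footing before extraction.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)
summary(pca)   # proportion of variance explained by each PC
head(pca$x)    # the PC scores for the first few observations
# The scree and variable plots would then be drawn with (requires factoextra):
# factoextra::fviz_eig(pca)
# factoextra::fviz_pca_var(pca)
```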

EFA: we implemented the model fit using cfa(...) and the fit measures using fitMeasures(...) from package lavaan (Rosseel, 2012), and we plotted the fitted models using lavaanPlot(...) from package lavaanPlot (Lishinski, 2021). The preliminary EFA tests use KMO(...) in package psych (William Revelle, 2023), cortest.bartlett(...) also in package psych, and det(cor(...)); the most likely number of latent factors uses several methods implemented by fa.parallel(...) and nfactors(...) in package psych.

CFA: we used factanal(...) from package stats (with the “promax” rotation) and plotted the resulting model using fa.diagram(...) in package psych.
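
A small sketch of the factanal(...) call on simulated data with two latent factors loading on six observed variables (the data and variable names are illustrative, not the study's):

```r
# Simulate six indicators driven by two latent factors, then recover the
# two-factor structure with factanal() from base stats ("promax" rotation).
set.seed(1)
n  <- 500
f1 <- rnorm(n); f2 <- rnorm(n)
m  <- cbind(x1 = f1 + rnorm(n, sd = 0.5), x2 = f1 + rnorm(n, sd = 0.5),
            x3 = f1 + rnorm(n, sd = 0.5), x4 = f2 + rnorm(n, sd = 0.5),
            x5 = f2 + rnorm(n, sd = 0.5), x6 = f2 + rnorm(n, sd = 0.5))
fa <- factanal(m, factors = 2, rotation = "promax")
print(fa$loadings)  # x1–x3 load on one factor, x4–x6 on the other
```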

Mokken: we estimated the Guttman errors using check.errors(...), the scalability coefficients H with coefH(...) and the automated item selection procedure with aisp(...), all in package mokken (Ark, 2007).

SDT: the quantile function (the inverse of the cumulative distribution function) of the normal distribution was computed using qnorm(...) in package stats; we estimated d’, β, c, A’ and B’’D using dprime(..., adjusted=TRUE) in package psycho (Makowski, 2018) with the adjustment for extreme values. Please see here and here for visual, non-technical explanations of SDT.

Regression: we tested the clustering by family using mixed-effects models as implemented by lmer(...) in package lme4 (Bates, Mächler, Bolker, & Walker, 2015) with p-values as implemented in package lmerTest (Kuznetsova, Brockhoff, & Christensen, 2017), and the intra-class correlation coefficient (ICC) was estimated using icc(...) in package performance (Lüdecke, Ben-Shachar, Patil, Waggoner, & Makowski, 2021). Logistic regression is implemented by glm(..., family=binomial("logit")) or glmer(..., family=binomial("logit")) (as appropriate), linear regression by lm(...) or lmer(...), Poisson regression by glm(..., family=poisson()), and Beta regression by glmmTMB(..., family=beta_family()) from package glmmTMB (Brooks et al., 2017).
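
The logistic regression setup can be sketched in base R on simulated data (the predictor name is illustrative, not taken from the actual dataset):

```r
# A binary "correct response" predicted by a continuous covariate via
# glm(..., family = binomial("logit")); the true slope here is 1.2.
set.seed(7)
n <- 300
working_memory <- rnorm(n)
p_correct <- plogis(0.5 + 1.2 * working_memory)  # true response probabilities
correct   <- rbinom(n, size = 1, prob = p_correct)
m_log <- glm(correct ~ working_memory, family = binomial("logit"))
coef(m_log)  # intercept and slope, on the log-odds scale
```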

Mediation: we used mediate(...) in package mediation (Tingley, Yamamoto, Hirose, Keele, & Imai, 2014) with 10,000 simulations and heteroskedasticity-consistent standard errors for “classic” mediation modeling, and psem(...) and dSep(...) from package piecewiseSEM (Lefcheck, 2016) for modeling the mediation through a piecewise Structural Equation Modelling approach with d-separation.
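
The logic of the mediation decomposition (the paper itself uses mediation::mediate(...) with simulations) can be illustrated with a simplified product-of-coefficients sketch on simulated data:

```r
# x -> m -> y mediation plus a direct x -> y path; coefficients chosen so
# that the true indirect effect is a*b = 0.6*0.5 = 0.3 and the direct 0.3.
set.seed(3)
n <- 1000
x <- rnorm(n)
m <- 0.6 * x + rnorm(n)             # mediator model: a = 0.6
y <- 0.3 * x + 0.5 * m + rnorm(n)   # outcome model: direct c' = 0.3, b = 0.5
a  <- coef(lm(m ~ x))["x"]          # path x -> m
bc <- coef(lm(y ~ x + m))           # paths x -> y (direct) and m -> y
indirect <- unname(a * bc["m"])     # mediated (indirect) effect, ~0.3
direct   <- unname(bc["x"])         # direct effect, ~0.3
total    <- indirect + direct       # total effect, ~0.6
```

This naive decomposition ignores the uncertainty propagation and robust standard errors that mediate(...) handles via simulation; it is only meant to show what is being estimated.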

Path models: we used the piecewise Structural Equation Modelling approach as implemented by psem(...) in package piecewiseSEM (Lefcheck, 2016) and lavaan (Rosseel, 2012).

4.2 Appendix II

Here we present some examples of responses on an AX task (such as the tone task here) where pcr fails to distinguish clearly different response patterns, but the SDT-derived discrimination and bias do.

Consider an AX task with equal numbers of signal and non-signal items, #(A ≠ X) = #(A = X), and several types of participant response strategies as given in Table S27. It can be seen that the first 6 strategies, some of which are clearly very different, have very similar percentages of correct responses (50% or very close to it), showing that relying on this measure alone might hide important inter-individual variation. Second, it can be seen that for the three extreme strategies 1, 3 and 5, the % correct responses is exactly 50% (as expected), but also that the sensitivity d’ and the bias β are identically 0.0 and 1.0, respectively, while the non-parametric estimates completely fail for strategies 1 and 3; however, the criterion location c does differentiate between these three strategies in the correct direction and strength. Slightly relaxing the extreme strategies 1 and 3, arguably making them more realistic (by allowing a small probability of just 1% of “error”), is enough to make all five measures of sensitivity and bias informative. The 6th strategy is also highly artificial but is correctly diagnosed by all 5 estimates. Finally, the last 2 strategies, while also extreme, might reflect actual participant behavior and are correctly diagnosed by the % correct responses and by the parametric estimates (but not by the non-parametric ones, whose estimation mostly fails). Therefore, this suggests that, in general, (a) the % correct responses by itself fails to disambiguate between clearly different strategies, but that (b) the parametric estimates d’ and c, when used together, do capture such differences correctly (the non-parametric estimates should work as well in non-extreme cases).

Table S27. Some examples of response patterns (1st column) for a total of 1000 items equally distributed between “signal” (A ≠ X, 500 items) and “non-signal” (A = X, 500 items) that result in very similar % correct responses (3rd column) but that may be disambiguated by SDT-derived measures of discrimination and bias: the 4th column gives the parametric estimates d’, β and c, while the last column gives the non-parametric ones A’ and B’’D, showing a dash if the estimation results in not-a-number (NaN). Please see the text for details about these estimates. The 2nd column gives the counts in the order hits:false alarms:misses:correct rejections.
Response pattern Counts % d’ (β, c) A’ (B’’D)
always respond “different” 500:500:0:0 50.0% 0.00 ( 1.00, -3.09) - (-)
99% of the time respond “different” 495:498:5:2 49.7% -0.29 ( 2.00, -2.43) 0.35 (-0.43)
always respond “same” 0:0:500:500 50.0% 0.00 ( 1.00, 3.09) - (-)
99% of the time respond “same” 9:3:491:497 50.6% 0.38 ( 2.38, 2.27) 0.67 ( 0.50)
respond correctly to exactly half and wrongly to the other half 250:250:250:250 50.0% 0.00 ( 1.00, -0.00) 0.50 ( 0.00)
respond randomly throwing a fair coin (i.e., 50%:50% chance) 259:234:241:266 52.5% 0.13 ( 1.00, 0.02) 0.55 ( 0.00)
respond correctly to all items 500:0:0:500 100.0% 6.18 ( 1.00, 0.00) 1.00 (-)
respond incorrectly to all items 0:500:500:0 0.0% -6.18 ( 1.00, 0.00) - (-)
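
The first two rows of Table S27 can be reproduced in base R using qnorm(...), assuming the log-linear correction for extreme counts (adding 0.5 to the hits and false alarms and 1 to the corresponding totals); the exact adjustment applied by dprime(..., adjusted=TRUE) in package psycho should be checked against its documentation:

```r
# d' and criterion c from raw counts, with a log-linear correction so that
# hit/false-alarm rates of 0 or 1 stay finite under qnorm().
sdt <- function(hits, fas, misses, crs) {
  H  <- (hits + 0.5) / (hits + misses + 1)  # adjusted hit rate
  FA <- (fas  + 0.5) / (fas  + crs    + 1)  # adjusted false-alarm rate
  c(dprime = qnorm(H) - qnorm(FA),
    c      = -(qnorm(H) + qnorm(FA)) / 2)
}
round(sdt(500, 500, 0, 0), 2)  # always "different": d' = 0.00, c = -3.09
round(sdt(495, 498, 5, 2), 2)  # 99% "different": d' ≈ -0.29, c ≈ -2.43 (cf. Table S27)
```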

5 Session information

CPU: Apple M3 (8 threads)

RAM (memory): 17.2 GB

R version 4.3.3 (2024-02-29)

Platform: aarch64-apple-darwin20 (64-bit)

locale: en_US.UTF-8||en_US.UTF-8||en_US.UTF-8||C||en_US.UTF-8||en_US.UTF-8

attached base packages: tools, parallel, stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: benchmarkme(v.1.0.8), mediation(v.4.5.0), sandwich(v.3.1-1), mvtnorm(v.1.3-3), DHARMa(v.0.4.7), performance(v.0.13.0), glmmTMB(v.1.1.10), lmerTest(v.3.1-3), lme4(v.1.1-36), Matrix(v.1.6-5), DiagrammeR(v.1.0.11), viridis(v.0.6.5), viridisLite(v.0.4.2), sjPlot(v.2.8.17), gplots(v.3.2.0), gridExtra(v.2.3), ggrepel(v.0.9.6), piecewiseSEM(v.2.3.0), lavaanPlot(v.0.8.1), lavaan(v.0.6-19), psycho(v.0.6.1), mokken(v.3.1.2), poLCA(v.1.6.0.1), MASS(v.7.3-60.0.1), scatterplot3d(v.0.3-44), psych(v.2.4.12), factoextra(v.1.0.7), ggplot2(v.3.5.1), reshape2(v.1.4.4), dplyr(v.1.1.4), tidyr(v.1.3.1), pander(v.0.6.5) and knitr(v.1.49)

loaded via a namespace (and not attached): RColorBrewer(v.1.1-3), rstudioapi(v.0.17.1), jsonlite(v.1.8.9), datawizard(v.1.0.0), magrittr(v.2.0.3), TH.data(v.1.1-2), estimability(v.1.5.1), farver(v.2.1.2), nloptr(v.2.1.1), rmarkdown(v.2.29), ragg(v.1.3.3), vctrs(v.0.6.5), minqa(v.1.2.8), effectsize(v.1.0.0), base64enc(v.0.1-3), rstatix(v.0.7.2), htmltools(v.0.5.8.1), forcats(v.1.0.0), curl(v.6.1.0), haven(v.2.5.4), broom(v.1.0.7), Formula(v.1.2-5), sjmisc(v.2.8.10), sass(v.0.4.9), KernSmooth(v.2.23-26), bslib(v.0.8.0), DiagrammeRsvg(v.0.1), htmlwidgets(v.1.6.4), plyr(v.1.8.9), emmeans(v.1.10.6), zoo(v.1.8-12), cachem(v.1.1.0), TMB(v.1.9.16), igraph(v.2.1.2), iterators(v.1.0.14), lifecycle(v.1.0.4), pkgconfig(v.2.0.3), sjlabelled(v.1.2.0), R6(v.2.5.1), fastmap(v.1.2.0), rbibutils(v.2.3), digest(v.0.6.37), numDeriv(v.2016.8-1.1), rsvg(v.2.6.1), colorspace(v.2.1-1), textshaping(v.0.4.1), Hmisc(v.5.2-2), ggpubr(v.0.6.0), labeling(v.0.4.3), httr(v.1.4.7), abind(v.1.4-8), mgcv(v.1.9-1), compiler(v.4.3.3), doParallel(v.1.0.17), withr(v.3.0.2), htmlTable(v.2.4.3), backports(v.1.5.0), carData(v.3.0-5), ggsignif(v.0.6.4), sjstats(v.0.19.0), gtools(v.3.9.5), caTools(v.1.18.3), pbivnorm(v.0.6.0), foreign(v.0.8-88), nnet(v.7.3-20), glue(v.1.8.0), quadprog(v.1.5-8), nlme(v.3.1-166), grid(v.4.3.3), checkmate(v.2.3.2), cluster(v.2.1.8), generics(v.0.1.3), lpSolve(v.5.6.23), gtable(v.0.3.6), data.table(v.1.16.4), hms(v.1.1.3), car(v.3.1-3), foreach(v.1.5.2), pillar(v.1.10.1), stringr(v.1.5.1), benchmarkmeData(v.1.0.4), splines(v.4.3.3), lattice(v.0.22-6), survival(v.3.8-3), tidyselect(v.1.2.1), reformulas(v.0.4.0), V8(v.6.0.0), stats4(v.4.3.3), xfun(v.0.50), MuMIn(v.1.48.4), visNetwork(v.2.1.2), stringi(v.1.8.4), yaml(v.2.3.10), boot(v.1.3-31), evaluate(v.1.0.3), codetools(v.0.2-20), tibble(v.3.2.1), cli(v.3.6.3), rpart(v.4.1.24), parameters(v.0.24.1), xtable(v.1.8-4), systemfonts(v.1.1.0), Rdpack(v.2.6.2), munsell(v.0.5.1), jquerylib(v.0.1.4), Rcpp(v.1.0.14), ggeffects(v.2.0.0), 
coda(v.0.19-4.1), bayestestR(v.0.15.0), bitops(v.1.0-9), scales(v.1.3.0), insight(v.1.0.1), purrr(v.1.0.2), rlang(v.1.1.4), multcomp(v.1.4-26) and mnormt(v.2.1.1)

References

Ark, L. A. van der. (2007). Mokken Scale Analysis in R. Journal of Statistical Software, 20, 1–19. https://doi.org/10.18637/jss.v020.i11
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Brooks, M. E., Kristensen, K., van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., … Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R Journal, 9(2), 378–400. https://doi.org/10.32614/RJ-2017-066
Dima, A. L. (2018). Scale validation in applied health research: Tutorial for a 6-step R-based psychometrics protocol. Health Psychology and Behavioral Medicine, 6(1), 136–161. https://doi.org/10.1080/21642850.2018.1472602
Donohue, C., & Wu, M. (2013). F0 and aspiration in Kam: Caught in the beginning. Proceedings of the International Conference on Phonetics of the Languages in China (ICPLC-13). Retrieved from https://www.researchgate.net/publication/341321803_F0_and_aspiration_in_Kam_Caught_in_the_beginning
Kassambara, A., & Mundt, F. (2020). Factoextra: Extract and visualize the results of multivariate data analyses. Retrieved from https://CRAN.R-project.org/package=factoextra
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
Lefcheck, J. S. (2016). piecewiseSEM: Piecewise structural equation modelling in r for ecology, evolution, and systematics. Methods in Ecology and Evolution, 7(5), 573–579. https://doi.org/10.1111/2041-210X.12512
Lishinski, A. (2021). lavaanPlot: Path diagrams for ’lavaan’ models via ’DiagrammeR’. Retrieved from https://CRAN.R-project.org/package=lavaanPlot
Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user’s guide. New York, NY, US: Cambridge University Press.
Makowski, D. (2018). The psycho package: An efficient and publishing-oriented workflow for psychological science. Journal of Open Source Software, 3(22), 470. https://doi.org/10.21105/joss.00470
Pallier, C. (2002). Computing discriminability and bias with the R software. Retrieved from https://www.pallier.org/pdfs/aprime.pdf
R Core Team. (2023). R: A language and environment for statistical computing. Retrieved from https://www.R-project.org/
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
Tingley, D., Yamamoto, T., Hirose, K., Keele, L., & Imai, K. (2014). mediation: R package for causal mediation analysis. Journal of Statistical Software, 59(5), 1–38. Retrieved from http://www.jstatsoft.org/v59/i05/
William Revelle. (2023). Psych: Procedures for psychological, psychometric, and personality research. Retrieved from https://CRAN.R-project.org/package=psych
Wong, P. C. M., Kang, X., Wong, K. H. Y., So, H.-C., Choy, K. W., & Geng, X. (2020). ASPM-lexical tone association in speakers of a tone language: Direct evidence for the genetic-biasing hypothesis of language evolution. Science Advances, 6(22), eaba5090. https://doi.org/10.1126/sciadv.aba5090
Wu, M. (2018). A Grammar of Sanjiang Kam. München, Germany: LINCOM Gmbh.
Yip, M. (2002). Tone. Cambridge, UK: Cambridge University Press.

  1. This is an example note. Click the symbol at the end of the note to go back to where the note is called in the text.↩︎

  2. Please note that some of the models used here are computationally expensive, and even compiling this Rmarkdown script might require a relatively powerful machine. To help with this, and to ensure full replicability of our results, we have cached some of these expensive sections in the cached_results folder as XZ-compressed RData files. However, it might happen that versions of some of the packages different from those that we used here might not be fully compatible with the saved RData files, resulting in errors compiling this Rmarkdown script or errors displaying/plotting the results. In this case, we recommend using the exact same versions of R and of the packages that we used (listed in the Session information), or, if not possible, the deletion of the offending RData files and the full recompilation of the Rmarkdown script (which is smart enough to re-generate only those missing cached results).↩︎